
A Hyper-heuristic Clustering Algorithm

Chun-Wei Tsai∗, Huei-Jyun Song†, and Ming-Chao Chiang†
∗Department of Applied Geoinformatics, Chia Nan University of Pharmacy & Science, Tainan 71710, Taiwan, R.O.C.
†Department of Computer Science and Engineering, National Sun Yat-sen University, Kaohsiung 80424, Taiwan, R.O.C.

E-mail:[email protected], [email protected], [email protected]

Abstract—The so-called heuristics have been widely used in solving combinatorial optimization problems because they provide a simple but effective way to find an approximate solution. These technologies are very useful for users who do not need the exact solution but who care very much about the response time. Because every existing heuristic algorithm has its pros and cons, this paper presents a hyper-heuristic clustering algorithm that uses diversity detection and improvement detection operators to determine when to switch from one heuristic algorithm to another, thereby improving the clustering result. Several well-known datasets are employed to evaluate the performance of the proposed algorithm. Simulation results show that the proposed algorithm can provide a better clustering result than the state-of-the-art heuristic algorithms compared in this paper, namely, k-means, simulated annealing, tabu search, and genetic k-means algorithm.

Keywords-Hyper-heuristics; clustering problem; genetic k-means algorithm.

I. INTRODUCTION

Finding an effective way to solve combinatorial optimization problems (COPs) is usually very difficult, especially when the goal is to develop a general-purpose tool for such problems. Not only does the domain knowledge required differ from one COP to another, but the solution space of most COPs (e.g., the scheduling problem, the traveling salesman problem, or the clustering problem) is generally huge. The time a search algorithm takes depends to a large extent on the size of the dataset. When the COP in question is NP-hard or NP-complete, there is simply no effective way to guarantee that an optimal solution will be found in a reasonable time using limited resources.

Among all the COPs, the clustering problem has long been an active research subject because solutions to it can be applied to many other problems, such as image and speech recognition, data compression, and data analysis [1]. The clustering problem is generally defined as follows: Given a set of patterns X = {x1, x2, . . . , xn} in d-dimensional space, partition the set X into k clusters Π = {π1, π2, . . . , πk} that minimize some predefined criterion. A good example is the sum of squared error (SSE), which is aimed at minimizing the distance between all the patterns and the centroids to which they belong and is defined as

    SSE = Σ_{i=1}^{k} Σ_{x∈πi} ‖x − ci‖²,

where k denotes the number of clusters in the dataset; πi the i-th cluster; ci the centroid of πi; and x a pattern in πi. Because the clustering problem is NP-hard [2], it is expected that an extraordinary amount of computation time is needed to find the optimal solution. Hence, heuristics [3] have been widely used in finding an approximate solution to such problems in a reasonable amount of time.
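For illustration, the following short NumPy sketch (our own code, not the authors'; the function and variable names are hypothetical) evaluates the SSE criterion for a given partition:

    import numpy as np

    def sse(X, labels, centroids):
        """Sum of squared error: total squared distance from each
        pattern to the centroid of the cluster it belongs to."""
        total = 0.0
        for i, c in enumerate(centroids):
            members = X[labels == i]          # patterns assigned to cluster i
            if len(members) > 0:
                total += np.sum((members - c) ** 2)
        return total

    # toy usage: 6 patterns in 2-d space, k = 2 clusters
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [9., 9.], [9., 8.], [8., 9.]])
    labels = np.array([0, 0, 0, 1, 1, 1])
    centroids = np.array([X[labels == i].mean(axis=0) for i in range(2)])
    print(sse(X, labels, centroids))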

The basic idea of heuristics is the so-called "educated guess," which provides a fast way to find possible solutions compared with a brute-force search algorithm. A very simple example of a heuristic algorithm that differs from brute-force or deterministic local search algorithms is simulated annealing (SA). The underlying idea of SA is to occasionally accept a solution that is worse than the current solution. SA was used in [4] to automatically determine the number of clusters and fuzzy membership values for clustering. Like SA, tabu search (TS) [5] is a single-solution-based heuristic algorithm (SSBHA). Unlike SA, TS uses a tabu list to avoid searching the same regions that have been visited recently.
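To make "occasionally accept a worse solution" concrete, here is a minimal sketch of the Metropolis-style acceptance test commonly used by SA; the function sa_accept, the neighbor-move comments, and the cooling step are illustrative assumptions, not code from the paper:

    import math
    import random

    def sa_accept(delta, temperature):
        """Accept a move with objective change `delta` (< 0 means better).
        Worse moves are accepted with probability exp(-delta / T)."""
        if delta <= 0:
            return True
        return random.random() < math.exp(-delta / temperature)

    # illustrative cooling loop skeleton
    T = 1.0
    for step in range(1000):
        # candidate = perturb(current)            # problem-specific neighbor move
        # delta = sse_of(candidate) - sse_of(current)
        # if sa_accept(delta, T): current = candidate
        T *= 1.0 - 0.001  # geometric cooling, matching T = (1 - 0.001)^l * T0 in Table I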

Different from the SSBHAs, a population-based heuristic algorithm (PBHA) uses multiple search directions to find a set of solutions at each iteration. The genetic algorithm (GA) is a PBHA that is aimed at global search. GA was utilized in [6] to solve the clustering problem. A later study [7] used centroid encoding for the GA and an arbitrary blank symbol to automatically determine the number of clusters. Recently, Omran et al. [8] employed a high-performance heuristic algorithm, particle swarm optimization (PSO), for the image classification problem; they used binary encoding to overcome the limitation that PSO can otherwise be applied to continuous solutions only.

In addition to combining one or more heuristic algorithms into a single search algorithm (called hybrid-heuristics), the hyper-heuristic algorithm [9], an important branch of heuristics, provides an alternative way to integrate multiple heuristic algorithms into a single search algorithm. The difference is that most hybrid-heuristic algorithms employ more than one heuristic algorithm at each iteration whereas the hyper-heuristic algorithm uses only one heuristic algorithm, selected from the candidate pool, for search at each iteration. In this paper, a hyper-heuristic algorithm is presented to provide a better clustering result compared with a single heuristic algorithm or a hybrid-heuristic algorithm.

The remainder of the paper is organized as follows. Section II gives a brief introduction to heuristics and hyper-heuristics. Section III describes in detail the proposed algorithm for clustering. Section IV gives the simulation results of the proposed algorithm. The conclusion is drawn in Section V.

II. RELATED WORK

A. Hybrid-Heuristic Algorithms for Clustering

Using a single heuristic algorithm may not provide a satisfactory result because every heuristic has its unique strong points and weak points. For instance, some heuristics are good at global search while others are good at local search. In order to improve the clustering result, many studies combine two or more heuristic algorithms into a single algorithm for solving the clustering problem, which is usually referred to as a hybrid-heuristic algorithm [10]. A well-known example is the genetic k-means algorithm (GKA) [11], which uses one-step k-means as the crossover operator of the genetic algorithm for clustering. The basic concept of GKA is to use the genetic algorithm to guide the global search directions and to use k-means to fine-tune the clustering results. Although there are many successful examples of using heuristic methods, hybrid methods, and other search techniques for clustering, two important problems still need to be solved: (1) some heuristic algorithms are designed for a specific purpose, that is, most of the heuristic algorithms are "problem-specific," which makes them ineffective for combinatorial optimization problems of a different kind; and (2) the computation cost increases as more heuristic algorithms or operators are added.

B. Hyper-Heuristic Algorithms

Because there are many successful examples of using the hyper-heuristic algorithm to solve combinatorial optimization problems, several reviews of hyper-heuristic algorithms [9], [12], [13] have been given to differentiate them and to make them easier to use. Although both hybrid-heuristics and hyper-heuristics combine two or more heuristic algorithms into a single search algorithm, the main difference between the two types of algorithms, as shown in Fig. 1, is that for most hyper-heuristic algorithms, only one of the low-level heuristic algorithms is performed each iteration whereas for the hybrid-heuristic algorithm, all the low-level heuristic algorithms are performed each iteration.

The naive way to pick a heuristic algorithm from the candidate pool is to rely on a random process [9]. A more advanced method to determine which low-level heuristic algorithm is to be used at each iteration was presented in [14], where both greedy and random strategies are employed to choose the low-level heuristic algorithm. Using "heuristics to select heuristics" is, of course, another way to select the low-level heuristic algorithm at each iteration. For example, hill-climbing [15] and the genetic algorithm [16] were used to select the low-level heuristic algorithm. A later study [17] employed a reinforcement learning method to dynamically determine the number of iterations for which the low-level heuristic should be performed.

Figure 1. A simplified example showing that the main difference between hybrid-heuristics and hyper-heuristics is that for the hyper-heuristic algorithm, only one of the low-level heuristic algorithms is performed each iteration whereas for the hybrid-heuristic algorithm, all the low-level heuristic algorithms are performed each iteration. (a) Hybrid-heuristics; (b) Hyper-heuristics.

III. THE PROPOSED METHOD

In this section, we present a novel hyper-heuristic algorithm, called the hyper-heuristic clustering algorithm with diversity detection operators (HHCAD), to solve the clustering problem. The proposed hyper-heuristic algorithm is designed to solve the clustering problem by using both single-solution-based and population-based heuristic algorithms as the low-level search algorithms. As far as this paper is concerned, four heuristics—namely, k-means, tabu search, simulated annealing, and genetic algorithm—are used as the low-level heuristic algorithms. In addition, a simple random method, an improvement detection method (see Section III-C), and a diversity detection method (see Section III-D) are used to determine when to change the low-level heuristic algorithm.


 1. T ← 0.
 2. Create the initial solution by using a random process.
 3. do
 4.     Randomly select a heuristic algorithm Ai from the candidate pool.
 5.     t ← 0.
 6.     do
 7.         Use Ai to classify the patterns.
 8.         T ← T + 1.
 9.         t ← t + 1.
10.     while (t ≤ te and ψ(Ai, I, D) = true)
11. while (T ≤ Te)
12. Output the best-so-far as the clustering result.

Figure 2. Outline of the proposed algorithm.

A. Hyper-Heuristic Clustering Algorithm with Diversity Detection Operators

The proposed algorithm is outlined in Fig. 2. However, before we proceed to describe the proposed algorithm in detail, here is how the solutions are encoded. In order to share the solutions found by the different low-level heuristic algorithms and pass them on to the others, the proposed algorithm encodes the centroids of each clustering result. That is, the i-th solution or chromosome, denoted si, is represented as si = 〈c1, c2, . . . , ck〉, where cj denotes the centroid of the j-th cluster and k the number of clusters.
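Under this encoding, a solution is just an ordered list of k centroids that any low-level heuristic can read or update. A minimal illustration follows (the array layout and names are our own assumption):

    import numpy as np

    k, d = 3, 2                      # 3 clusters in 2-d space
    # s_i = <c_1, c_2, ..., c_k>: one row per cluster centroid
    s_i = np.array([[0.5, 0.7],      # c_1
                    [4.2, 3.9],      # c_2
                    [8.1, 7.6]])     # c_3
    assert s_i.shape == (k, d)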

On line 1, the counter T is reset; on line 2, the initial solutions are created by using a random process. Lines 3 to 11 describe in detail how the proposed algorithm works. HHCAD first selects at random a low-level heuristic algorithm Ai from the candidate pool. As noted previously, as far as this study is concerned, the low-level heuristic algorithms used are k-means, TS, SA, and GA. Next, the counter t is reset. Then, the selected low-level heuristic algorithm Ai is performed to cluster the dataset in question, using the solutions obtained by the previous low-level heuristic algorithm, say Aj, as its initial solutions. Only when the maximum number of iterations for the low-level heuristic algorithm is reached or the detection operators return false will HHCAD exit the loop of the current low-level heuristic algorithm Ai. Then, control returns to the hyper-heuristic algorithm, and the loop on lines 3 to 11 will be performed repeatedly until the termination condition is reached. How the diversity of and the improvement made by the low-level heuristic algorithms are detected by the proposed algorithm is described in Eq. (1). That is, when to switch from one search algorithm to another is determined by the result of ψ(Ai, I, D), where the parameters Ai, I, and D denote, respectively, the low-level heuristic algorithm selected, the improvement detection operator, and the diversity detection operator. For the SSBHAs, only I is used whereas for the PBHAs, both I and D are used.
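Putting the outline of Fig. 2 into code, a skeleton of the main loop could look as follows; low_level_pool, random_solution, run_one_iteration, and psi are hypothetical stand-ins for the candidate pool, the random initializer, one search step of the selected heuristic, and the detection function ψ of Eq. (1) below. This is a sketch, not the authors' implementation:

    import random

    def hhcad(low_level_pool, random_solution, run_one_iteration, psi,
              T_e, t_e):
        """Skeleton of the hyper-heuristic loop in Fig. 2."""
        state = random_solution()                 # line 2: random initial solution(s)
        T = 0                                     # line 1: global iteration counter
        while T <= T_e:                           # lines 3-11: outer loop
            A_i = random.choice(low_level_pool)   # line 4: pick a heuristic at random
            t = 0                                 # line 5: inner counter
            while True:
                state = run_one_iteration(A_i, state)   # line 7: one search step
                T += 1                            # line 8
                t += 1                            # line 9
                if not (t <= t_e and psi(A_i, state)):  # line 10: exit condition
                    break
        return state                              # line 12: best-so-far result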

ψ(Ai, I, D) =
    true,  if Ai ∈ S and I = true;
    true,  if Ai ∈ P and I = true and D = true;      (1)
    false, otherwise;

where S denotes the set of SSBHAs, P the set of PBHAs, "I = true" denotes that I returns true, and "D = true" denotes that D returns true.
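Eq. (1) translates almost directly into code. In the following sketch, the sets S and P are hardcoded for the four low-level heuristics used in this paper, and I and D are assumed to be booleans already computed by the operators of Sections III-C and III-D:

    SSBHAS = {"k-means", "TS", "SA"}   # single-solution-based heuristics (S)
    PBHAS = {"GA"}                     # population-based heuristics (P)

    def psi(A_i, I, D):
        """Eq. (1): keep running A_i only while it still looks promising."""
        if A_i in SSBHAS:
            return I                   # SSBHAs: only improvement matters
        if A_i in PBHAS:
            return I and D             # PBHAs: improvement and diversity
        return False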

Also note that in Fig. 2, Te denotes the maximum number of iterations to be performed by the proposed algorithm; te, with 0 < te < Te, the number of iterations the low-level heuristic algorithm will be performed once it is selected by HHCAD; T the number of iterations performed so far by the proposed algorithm; and t the number of iterations performed so far by the low-level heuristic algorithm.

B. The Information Sharing

To retain the search results of a heuristic algorithm Ai so that they can be used by other heuristic algorithms or by Ai itself in another round, HHCAD uses a lookup table to record these results, such as the best-so-far solution, the current solution(s), the tabu list, and the annealing temperature. After a heuristic algorithm Aj has finished searching the clustering result, the information in the lookup table is updated so that it can be shared and passed to another heuristic algorithm Ak at a later iteration. For instance, the annealing temperature T of SA will be updated to T′ after SA has finished searching the clustering result, and it will be updated again if SA is performed at later iterations. In summary, HHCAD transits from one state to another as Sℓ+1 = Ai(Sℓ), where Ai is either k-means, TS, SA, or GA; ℓ is the iteration number; and Sℓ is the clustering result. This differs from the hybrid-heuristic algorithm, which in this case performs all of the low-level heuristic algorithms, namely, k-means, TS, SA, and GA.
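The lookup table can be pictured as a small shared-state record handed from one low-level heuristic to the next; the field names in this sketch are our own assumptions, not the authors':

    # hypothetical shared state passed between low-level heuristics
    shared = {
        "best_so_far": None,     # best solution found by any heuristic
        "current": None,         # current solution(s) handed to the next A_i
        "tabu_list": [],         # persists across separate runs of TS
        "temperature": 1.0,      # persists across separate runs of SA
    }

    def run_sa(shared):
        # ... perform up to t_e iterations of SA from shared["current"] ...
        shared["temperature"] *= 1.0 - 0.001   # T updated to T' and kept for next time
        # shared["current"] and shared["best_so_far"] are updated likewise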

C. The Improvement Detection Operator

A random procedure is used to select the low-level heuristic algorithm in this paper. Some of the low-level heuristic algorithms, such as k-means, are used to exploit the current solution while others, such as GA, are used to explore other solutions. However, measuring the diversity of these heuristic algorithms is a difficult problem because some of them are single-solution-based heuristic algorithms (SSBHAs), implying that they work on one and only one solution at each iteration. Another problem is that it may take more than one iteration to improve the result of an SSBHA.

A very simple approach is adopted to detect when to switch to another heuristic to search the solutions. It only checks the improvement of the solutions found by the low-level heuristic algorithms. If a heuristic algorithm cannot improve the current solutions for a certain number of iterations, denoted Pe, the improvement detection operator I of HHCAD will return a false value to signal the proposed algorithm to pick at random another low-level search algorithm. Note again that the counters t and T are used to count the number of iterations performed so far by the selected low-level heuristic algorithm and the hyper-heuristic algorithm, respectively, so that we know if the stop condition is reached.
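A minimal realization of the improvement detection operator I simply counts consecutive non-improving iterations against the threshold Pe; the class below is our own sketch, not the authors' code:

    class ImprovementDetector:
        """Returns False once the best SSE has not improved for P_e
        consecutive iterations, signalling HHCAD to switch heuristics."""
        def __init__(self, P_e=3):           # Pe; set to 3 in this paper's experiments
            self.P_e = P_e
            self.best = float("inf")
            self.stall = 0

        def __call__(self, current_sse):
            if current_sse < self.best:      # improvement observed
                self.best = current_sse
                self.stall = 0
            else:                            # no improvement this iteration
                self.stall += 1
            return self.stall < self.P_e     # True means "keep going"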

D. The Diversity Detection Operator

Because PBHAs (i.e., GA in this research) use multiple search directions at the same time, the search diversity often has a strong impact on the final solution, for a very simple reason: the higher the search diversity, the less likely the search algorithm will fall into local optima. However, setting the diversity is a difficult problem because too high a search diversity may turn the heuristic algorithm into a random search algorithm while too low a search diversity may turn it into a greedy search algorithm. Worse, a higher search diversity does not guarantee a better result, though it has a better chance of finding one, and a high search diversity implies a longer search time.

In addition to the improvement detection operator I, a very simple diversity detection operator D is presented in this paper for PBHAs to automatically determine the "timing" to switch from one search strategy to another. This operator uses the diversity of the initial solutions as the threshold. The proposed algorithm computes the average and standard deviation of the distances between the centroids of the first solution S1 and the centroids of all the other n − 1 solutions, i.e., S2, S3, . . . , Sn, at each iteration. Then, if the average distance at the ℓ-th iteration μℓ is less than ω (the diversity threshold) and I returns true, the proposed algorithm will randomly select another low-level search algorithm. Note that ω here is defined as μ1 − 3 × σ1, where μ1 and σ1 denote, respectively, the average distance and the standard deviation of the distances between the centroids of the first solution and the centroids of all the other n − 1 solutions at the first iteration.
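The diversity detection operator D can be sketched as follows: it measures the average distance between the centroids of S1 and those of the other n − 1 solutions and compares it with ω = μ1 − 3 × σ1 fixed at the first iteration. This is our own code; in particular, taking the Frobenius norm between whole centroid matrices as the per-solution distance is an assumption:

    import numpy as np

    def centroid_distances(population):
        """Distances between solution S1 and each other solution,
        where each solution is a (k, d) array of centroids."""
        s1 = population[0]
        return np.array([np.linalg.norm(s - s1) for s in population[1:]])

    class DiversityDetector:
        def __init__(self, initial_population):
            d = centroid_distances(initial_population)
            self.omega = d.mean() - 3 * d.std()   # omega = mu_1 - 3 * sigma_1

        def __call__(self, population):
            mu = centroid_distances(population).mean()
            return mu >= self.omega   # False (mu < omega) suggests switching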

E. An Example

Figure 3. A simplified example showing how the proposed algorithm works. (a) SSBHAs; (b) PBHAs.

A simple example is given in this section to illustrate how the proposed algorithm works. As depicted in Figs. 3(a) and (b), there are two cases to deal with: one is for SSBHAs while the other is for PBHAs. In both cases, a low-level heuristic algorithm is first randomly chosen from the pool of low-level heuristic algorithms. If the low-level heuristic algorithm selected is an SSBHA, say, simulated annealing (SA), then the improvement detection operator will be used to check the improvement, as shown in Fig. 3(a). In this case, the diversity detection operator is not used. However, if the heuristic algorithm selected is a PBHA, then both the improvement detection operator and the diversity detection operator are used, as shown in Fig. 3(b). For instance, let us assume that the low-level heuristic algorithm selected is the genetic algorithm.

Figure 4. A simplified example showing that solutions in the population are very similar to each other when the low-level heuristic algorithm is a PBHA.

Let us further assume that the solutions in the population at the ℓ-th iteration are S1, S2, S3, S4, and S5, each of which contains three centroids, denoted ci1, ci2, and ci3, i = 1, 2, 3, 4, 5, as depicted in Fig. 4. The diversity detection operator will compute the average and standard deviation of the distances between the centroids of the first solution S1 and those of all the other four solutions S2, S3, S4, and S5 and then compare the average distance μℓ with the diversity threshold ω. If μℓ is less than ω, then the improvement detection operator I will be invoked to check the improvement. If I returns true, then the proposed algorithm will randomly select another low-level search algorithm.

IV. EXPERIMENTAL RESULTS

A. Parameter Settings and Datasets

Table I
THE PARAMETERS OF THE HEURISTIC ALGORITHMS

Algorithm   Parameters
SA          initial temperature T0 = 1.0; T = (1 − 0.001)^ℓ × T0
TS          tabu list size = 10
GA          crossover rate = 0.8; mutation rate = 0.05
GKA         mutation rate = 0.05

The empirical analysis was conducted on a PC with a 2.50GHz Intel Core 2 Quad CPU Q8300 and 4GB of memory, running Fedora 15 with Linux kernel 2.6.42.9-1.fc15.x86_64; the programs were written in C++ and compiled using g++. To evaluate the performance of HHCAD for the clustering problem, we compare it with k-means [18], simulated annealing [19], tabu search [13], and the genetic k-means algorithm (GKA) [11].

Several different kinds of datasets from UCI [20] and from [21]—579, ecoli, glass, haberman, iris, and wine—are used to evaluate the performance of these algorithms. These datasets have different characteristics. For instance, abalone has the largest number of clusters and patterns. Another example is iris, which is not linearly separable.

Each simulation is carried out for 30 runs, and the maximum number of iterations of each run is 500. The population size is set to 10 for the GA-based algorithms, and 10 candidate solutions are constructed at each iteration for SA and TS. The number of iterations te for the low-level heuristic algorithm is set to 20, and Pe is set to 3. The other parameter settings of the clustering algorithms in this paper are given in Table I, where T denotes the temperature and ℓ the iteration number.

Table II
THE COMPARISON OF THE HEURISTIC ALGORITHMS BY SSE

Dataset     k-means     SA          TS          GKA         HHCAD
iris         83.3035     85.3323     82.1332     78.9411     78.9408
glass        20.7398     19.7115     20.4884     19.1403     18.9992
579         360.5640    360.5640    360.7420    360.5640    360.5640
breast      244.3400    244.3370    244.6190    244.3380    244.3370
haberman     25.1880     25.1878     26.5403     25.1838     25.1838
wine         49.4094     48.9351     49.1950     48.9409     48.9381
ecoli        15.1483     14.4963     15.9077     13.9749     13.9200

B. Results

As shown in Table II, the simulation results show that the proposed algorithm can provide a better result than the other heuristic algorithms in terms of the SSE. Some of the simulation results show that SA is worse than k-means, but for complex datasets, such as ecoli, SA provides a better result than k-means. Comparing TS with k-means gives similar results. According to our observation, without domain knowledge, these search algorithms can be guided only by the values of the objective function and the search strategy; therefore, these two algorithms cannot provide a better solution than k-means for all the datasets.

The simulation results of Table II also show that GKA provides a better result than k-means, SA, and TS in terms of the SSE for most of the datasets. In fact, the hybrid-heuristic algorithm (GKA) and the hyper-heuristic algorithm (HHCAD) both employ GA to search for the global directions while at the same time relying on k-means to fine-tune the solutions. The difference is that HHCAD also uses SA and TS in the search process. The simulation results show that the proposed algorithm can provide a better result than GKA. According to our observation, both algorithms, GKA and HHCAD, outperform k-means, while the proposed algorithm can even provide a better result than GKA on four of the datasets. Note that because k-means is a single-solution-based algorithm, it is very sensitive to the initial solution, but GKA and HHCAD are not. The simulation results show that these two clustering algorithms are very effective in improving the clustering results.

V. CONCLUSION

In this paper, we present an effective algorithm for improving the accuracy of a clustering algorithm. The proposed hyper-heuristic algorithm, called HHCAD, is motivated by the high-level strategy of finding a suitable heuristic to solve the clustering problem. In order to find a better solution, we use an improvement detection operator to determine when to change the low-level heuristic algorithm. This operator can eventually reduce the probability of falling into a local optimum. The simulation results show that the proposed algorithm outperforms the other state-of-the-art meta-heuristic algorithms compared in this paper, namely, k-means, simulated annealing, tabu search, genetic algorithm, and the genetic k-means algorithm. In summary, the main contributions of this paper are threefold: (1) the proposed algorithm can effectively choose a suitable heuristic by using the hyper-heuristic strategy; (2) the proposed algorithm can easily be combined with other algorithms to improve their quality; and (3) the proposed algorithm can maintain a high diversity of solutions and reduce the probability of falling into local optima.

ACKNOWLEDGMENT

This work was supported in part by the National Science Council of Taiwan, R.O.C., under Contracts NSC100-2218-E-041-001-MY2, NSC98-2221-E-110-049, and NSC99-2221-E-110-052.

REFERENCES

[1] R. Xu and D. Wunsch II, "Survey of clustering algorithms," IEEE Transactions on Neural Networks, vol. 16, no. 3, pp. 645–678, 2005.

[2] J. Kogan, Introduction to Clustering Large and High-Dimensional Data. New York, NY, USA: Cambridge University Press, 2007.

[3] C. Blum and A. Roli, "Metaheuristics in combinatorial optimization: Overview and conceptual comparison," ACM Computing Surveys, vol. 35, no. 3, pp. 268–308, 2003.

[4] W. Yang, L. Rueda, and A. Ngom, "A simulated annealing approach to find the optimal parameters for fuzzy clustering microarray data," in Proceedings of the International Conference of the Chilean Computer Science Society (SCCC), 2005, pp. 45–54.

[5] K. S. Al-Sultan, "A tabu search approach to the clustering problem," Pattern Recognition, vol. 28, no. 9, pp. 1443–1451, 1995.

[6] U. Maulik and S. Bandyopadhyay, "Genetic algorithm-based clustering technique," Pattern Recognition, vol. 33, no. 9, pp. 1455–1465, 2000.

[7] S. Bandyopadhyay and U. Maulik, "Genetic clustering for automatic evolution of clusters and application to image classification," Pattern Recognition, vol. 35, no. 6, pp. 1197–1208, 2002.

[8] M. G. Omran, A. A. Salman, and A. P. Engelbrecht, "Dynamic clustering using particle swarm optimization with application in image segmentation," Pattern Analysis & Applications, vol. 8, no. 4, pp. 332–344, 2006.

[9] E. K. Burke, M. Gendreau, M. Hyde, G. Kendall, G. Ochoa, E. Özcan, and R. Qu, "Hyper-heuristics: A survey of the state of the art," Journal of the Operational Research Society, to appear, 2012.

[10] C. Blum, J. Puchinger, G. R. Raidl, and A. Roli, "Hybrid metaheuristics in combinatorial optimization: A survey," Applied Soft Computing, vol. 11, no. 6, pp. 4135–4151, 2011.

[11] K. Krishna and M. N. Murty, "Genetic k-means algorithm," IEEE Transactions on Systems, Man, and Cybernetics, Part B, vol. 29, no. 3, pp. 433–439, 1999.

[12] P. Ross, "Hyper-heuristics," in Search Methodologies, E. K. Burke and G. Kendall, Eds. Springer, 2005, pp. 529–556.

[13] F. Glover and G. Kochenberger, Eds., Handbook of Metaheuristics. Kluwer Academic Publishers, 2003.

[14] P. I. Cowling and K. Chakhlevitch, "Using a large set of low level heuristics in a hyperheuristic approach to personnel scheduling," in Evolutionary Scheduling, 2007, pp. 543–576.

[15] R. H. Storer, S. D. Wu, and R. Vaccari, "New search spaces for sequencing problems with application to job shop scheduling," Management Science, vol. 38, no. 10, pp. 1495–1509, 1992.

[16] H.-L. Fang, P. Ross, and D. Corne, "A promising genetic algorithm approach to job-shop scheduling, rescheduling, and open-shop scheduling problems," in Proceedings of the Fifth International Conference on Genetic Algorithms. Morgan Kaufmann, 1993, pp. 375–382.

[17] A. Nareyek, "Choosing search heuristics by non-stationary reinforcement learning," in Metaheuristics: Computer Decision-Making. Kluwer Academic Publishers, 2001, pp. 523–544.

[18] J. B. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, 1967, pp. 281–297.

[19] S. Kirkpatrick, C. D. Gelatt Jr., and M. P. Vecchi, "Optimization by simulated annealing," Science, vol. 220, no. 4598, pp. 671–680, 1983.

[20] UCI Machine Learning Repository, 2012, available at http://archive.ics.uci.edu/ml/.

[21] M. C. Su and H. T. Chang, "Fast self-organizing feature map algorithm," IEEE Transactions on Neural Networks, vol. 11, no. 3, pp. 721–733, 2000.
