A CLUSTERING BASED MATE SELECTION FOR EVOLUTIONARY OPTIMIZATION

Abstract. Mate selection plays a key role in the natural evolution process. Although a variety of mating strategies have been proposed in the community of evolutionary computation, the importance of mate selection has largely been ignored. In this paper, we propose a clustering based mate selection (CMS) strategy for evolutionary algorithms (EAs). In CMS, the population is partitioned into clusters and only the solutions in the same cluster are chosen for offspring reproduction. Instead of running a complete clustering process in each EA generation, the clustering iteration is combined with the evolution iteration. This combination benefits EAs by saving the cost of discovering the population structure. To demonstrate the idea, a CMS utilizing the k-means clustering method is proposed and applied to a state-of-the-art EA. The experimental results show that the CMS strategy is a promising way to improve the performance of the EA.

1. Introduction. In this paper, we consider the following continuous global optimization problem.
min f(x)   s.t.   x ∈ [a_i, b_i]^n   (1)
where x = (x_1, x_2, ..., x_n) ∈ R^n is a decision variable vector; [a_i, b_i]^n defines the feasible region of the decision space, with a_i ≤ x_i ≤ b_i for i = 1, 2, ..., n; and f : R^n → R is the objective function.
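As a concrete instance of problem (1), consider the sphere function over a box; the function choice and names here are our own illustration, not taken from the paper:

```python
def sphere(x):
    """Objective f(x) = sum_i x_i^2, minimized at x = (0, ..., 0)."""
    return sum(xi * xi for xi in x)

def in_box(x, a, b):
    """Check the box constraint a_i <= x_i <= b_i for every coordinate."""
    return all(ai <= xi <= bi for xi, ai, bi in zip(x, a, b))
```

Any candidate solution an EA evaluates for (1) must first satisfy `in_box` before `sphere` (or any other objective) is computed.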
The evolutionary algorithm (EA) is a type of heuristic optimization method inspired by the natural evolution process [1]. It has become a major method to tackle (1). The major components of a general EA are a reproduction operator and a selection operator. A key operation in natural evolution, called mate selection, chooses mating pairs or groups for breeding and plays a central role in sexual propagation. In EAs, a properly designed mate selection can also control population convergence and diversity efficiently [6,12,13]. In the last decades, a number of mating strategies have been proposed [16], including random mating, roulette wheel selection, truncation selection, tournament selection, gender based selection [7,15,19], niche based selection [2], dissociative selection [3,4], and some other methods [5,14,18]. Despite these proposals, mate selection has not attracted much attention in the community of evolutionary computation [16]. The major reasons might be that (a) most existing mating strategies need problem-specific control parameters or are computationally expensive, and (b) some widely used EAs work well by randomly choosing mating pairs. In this paper, we demonstrate that existing EAs can be improved by properly designed mating strategies.
Statistical and machine learning (SML) techniques aim to extract information from data sets and transform it into an understandable pattern or structure for further use [8]. It is arguable that in EAs, an individual can be regarded as a training example, and its corresponding fitness value as a label. From the viewpoint of SML, the population of an EA forms a training data set. Therefore, SML techniques can be naturally applied to EAs to extract population information and guide the search. Some algorithms, such as estimation of distribution algorithms [10] and surrogate assisted evolutionary algorithms [9], follow this direction. However, SML techniques are computationally expensive compared to general EAs, which limits their usage. How to use SML techniques in EAs more efficiently is still an open question.
In this paper, we present a new way to combine SML methods with EAs. The basic idea is to alternate an SML training step and an EA evolving step. In the SML step, the current population is utilized to train a model that captures the population structure; in the EA step, the structure information extracted in the SML step is used to guide the search. Interleaving the SML iteration process with the EA iteration process finds and refines the population structure information and thus saves SML cost. In multi-objective evolutionary optimization, there are some works with a similar idea [20]. For scalar-objective optimization, however, this strategy is still new. Based on this idea, this paper proposes a clustering based mate selection (CMS) operator for EAs. In CMS, the population is partitioned into clusters in each generation, and only the solutions in the same cluster are allowed to mate with each other. A CMS utilizing the k-means [11] clustering method is proposed and applied to a state-of-the-art EA to show its advantages.
The rest of the paper is organized as follows. Section 2 presents the proposed CMS strategy and introduces an EA integrated with CMS in detail. Section 3 compares the proposed CMS strategy with some other mating strategies and studies the influence of the control parameters. Finally, the paper is concluded in Section 4.

2. Clustering based mate selection. A major challenge in applying SML techniques in EAs is their high computational cost. This section introduces a clustering based mate selection (CMS) to address this challenge. The basic idea is to combine an EA with an iterative clustering method. Take the k-means method as an example. In each generation (iteration) of the combined process, the clustering process uses the EA population to assign points and update cluster centers; then, based on the resulting population partition, the EA chooses parents from the same cluster to generate new trial solutions. It should be noted that the CMS assisted EA does not run a complete clustering method in each generation. Instead, it combines the clustering iteration with the EA iteration, so only one clustering iteration is executed per generation; the clustering process is thus spread out along the EA process, and the computational cost is saved. The major components of an iterative clustering method and an EA are combined in the CMS assisted EA (CMS-EA for short). A restart checking component is added to reinitialize the clustering; its purpose is to prevent the clustering from getting stuck in local optima.
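The single k-means iteration interleaved with each EA generation (one assignment pass plus one center update, rather than running k-means to convergence) can be sketched as follows; the function and variable names are our own illustration:

```python
import numpy as np

def kmeans_iteration(population, centers):
    """One k-means step: assign each individual to its nearest center,
    then recompute each center as the mean of its assigned individuals."""
    # Assignment step: pairwise distances between N individuals and K centers.
    dists = np.linalg.norm(population[:, None, :] - centers[None, :, :], axis=2)
    labels = np.argmin(dists, axis=1)
    # Update step: recompute each center; keep the old center if its cluster is empty.
    new_centers = centers.copy()
    for k in range(centers.shape[0]):
        members = population[labels == k]
        if len(members) > 0:
            new_centers[k] = members.mean(axis=0)
    return labels, new_centers
```

Calling this once per generation, with the centers carried over from the previous generation, lets the partition track the moving population at a cost of one assignment pass per generation.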
In this paper, we use the CMS strategy to improve the performance of the composite differential evolution (CoDE) algorithm [21]. In CoDE, each solution produces three candidate offspring solutions by using three reproduction operators with three randomly selected control parameter settings, and the best candidate is chosen as the offspring solution for updating. More details of CoDE can be found in [21]. The k-means clustering method is used to partition the population. In the following, we give the framework of the proposed approach, named CMS-CoDE.

5.3. Replace x by y if f(y) < f(x).
6. If the stop condition is not satisfied, set g = g + 1 and go to Step 2; otherwise, terminate and return the best solution found so far.
We make the following comments on the above algorithm.
• In Step 2, the clustering process is re-initialized every G generations. The purpose is to prevent the clustering process from being trapped in local optima.
• In Step 5.1, CoDE generates three candidate solutions for each solution x with the rand/1/bin, rand/2/bin, and current-to-rand/1 operators [21]:
  rand/1/bin: y_j = x_{r1,j} + F · (x_{r2,j} − x_{r3,j}) if rand < C_r or j = j_rnd, and y_j = x_j otherwise;
  rand/2/bin: y_j = x_{r1,j} + F · (x_{r2,j} − x_{r3,j}) + F · (x_{r4,j} − x_{r5,j}) if rand < C_r or j = j_rnd, and y_j = x_j otherwise;
  current-to-rand/1: y = x + rand · (x_{r1} − x) + F · (x_{r2} − x_{r3}),
where j = 1, 2, ..., n, j_rnd is a random index between 1 and n, rand returns a random number in [0.0, 1.0], x_{r1}–x_{r5} are parents randomly selected from the same cluster as x, and F and C_r are two control parameters whose values are randomly selected from the settings [F = 1.0, C_r = 0.1], [F = 1.0, C_r = 0.9], and [F = 0.8, C_r = 0.2].
• In Step 6, CoDE terminates when the number of function evaluations exceeds a given threshold.
• A minimum of 5 solutions per cluster is required for reproduction. If a cluster contains fewer than 5 solutions, the parents are selected from the whole population.
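The cluster-restricted parent selection with the minimum cluster size of 5, as described in the comments above, can be sketched as follows (names are illustrative; the DE reproduction itself is omitted):

```python
import numpy as np

MIN_CLUSTER_SIZE = 5  # below this, fall back to the whole population

def select_parents(i, labels, rng, num_parents=5):
    """Pick `num_parents` distinct parents for solution i from its own
    cluster, or from the whole population if the cluster is too small."""
    pool = np.where(labels == labels[i])[0]
    if len(pool) < MIN_CLUSTER_SIZE:
        pool = np.arange(len(labels))      # fallback: whole population
    pool = pool[pool != i]                 # a solution does not mate with itself
    return rng.choice(pool, size=num_parents, replace=False)
```

With this helper, the DE operators above draw x_{r1}–x_{r5} from the returned indices, so mating stays inside the cluster whenever the cluster is large enough.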
3. Comparison with other mating strategies. In this section, we compare the proposed CMS strategy with the following related strategies.
Random mating strategy (RND): the parent solutions are randomly chosen from the whole population. The original CoDE algorithm utilizes this strategy.
Nearest neighbor strategy (NNS): for a solution x, this strategy selects the N/K solutions closest to x to form its mating pool, and the parents are randomly chosen from that pool, where N is the population size and K is the number of niches into which the population is divided.
Batch clustering based strategy (BCS): like the CMS strategy, this strategy uses the k-means method to partition the population. The difference is that the whole clustering process is run at the beginning of each generation.
All the strategies are incorporated into the CoDE algorithm in the same way as CMS-CoDE.
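The NNS mating pool described above (the N/K neighbors nearest to a solution) can be sketched as follows; names are illustrative:

```python
import numpy as np

def nns_mating_pool(i, population, K):
    """Return the indices of the N/K solutions closest to solution i
    (excluding i itself), which form its mating pool under NNS."""
    N = len(population)
    dists = np.linalg.norm(population - population[i], axis=1)
    dists[i] = np.inf                 # exclude the solution itself
    pool_size = max(1, N // K)
    return np.argsort(dists)[:pool_size]
```

Note that this pool must be recomputed from pairwise distances for every solution, which is the source of the higher complexity of NNS discussed in the analysis below.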
To assess the performance of the compared strategies, the first 20 instances from the CEC 2005 test suite [17] are used. The experimental parameters are as follows: the dimension is n = 30 for all 20 problems; each algorithm is run 30 independent times, each run stopping after a maximum of 300,000 function evaluations (FEs); the population size is N = 100 for all algorithms; the number of clusters is K = 3 in the k-means clustering; and the k-means restarts every G = 10 generations. For a fair comparison, the Wilcoxon rank sum test at a 0.05 significance level is conducted; −, +, and ≈ in the tables indicate that the performance of the corresponding method is better than, worse than, and similar to that of CMS, respectively. All the algorithms are executed on the same workstation.
3.1. Experimental results. The experimental results are given in Table 1, and the population partitions of a typical run for BCS and CMS strategies with CoDE are plotted in Fig. 1 on two instances.
RND vs. CMS: From Table 1, we can see that CMS-CoDE performs better than RND-CoDE on 15 test instances and worse on 3. This suggests that restricting the mating parents to similar individuals can improve the algorithm performance significantly.
NNS vs. CMS: It is clear from Table 1 that CMS-CoDE outperforms NNS-CoDE on 7 instances and is outperformed by NNS-CoDE on 10. In NNS, the parents are the closest solutions, with characteristics similar to the given solution; this may help converge to optima quickly, especially when there is no variable dependency in the problem. In CMS, the parents are likely to be close to the solution, but there is still some probability that the parents are far from each other, which may help keep population diversity. This might explain the different performances of NNS and CMS. Although the results are comparable, we show in the next section that CMS-CoDE has a lower theoretical computational complexity than NNS-CoDE.
BCS vs. CMS: It is surprising that BCS-CoDE performs only slightly better than CMS-CoDE. Table 1 shows little difference between the results of the two algorithms on 13 out of 20 test instances. The reason might be that the clustering result of k-means depends highly on the initial cluster centers, and k-means is very likely to converge to local optima. The mis-clustering in k-means therefore introduces some randomness into the population and helps prevent premature convergence. Although BCS-CoDE is slightly superior to CMS-CoDE, it has a higher computational complexity according to the analysis in the next section.
With respect to the population partitions of the BCS and CMS strategies with CoDE on F2 and F3, Fig. 1 shows that, over 9 consecutive generations, the partitions of CMS-CoDE change little, while BCS-CoDE presents quite different partitions at each generation. The reason might be that the clustering operation of CMS-CoDE iterates only once per generation, whereas that of BCS-CoDE iterates many times. A stable partition should be more helpful for generating high-quality solutions.
3.2. Computational complexity analysis. NNS: the total time complexity is O(N² · n + (1/K) · N³).
BCS: In the k-means assignment step, the time complexity of assigning the points to clusters is O(N · K · n). In the update step, the time complexity is O((|C_1| + |C_2| + ... + |C_K|) · n) = O(N · n). Each solution randomly selects at most 5 parents from its cluster, with time complexity O(N). Suppose the number of training steps in k-means is L; the total time complexity is then O((K + 1) · N · n · L + N).
CMS: From the above analysis, the time complexity is O((K + 1) · n · N + N).
It is reasonable to assume that K ≪ N and N ≪ L. The time complexities of NNS and BCS are then much higher than those of RND and CMS. We can also see that although the time complexity of CMS is higher than that of RND, it is still linear in N. We also record the CPU run time in Table 2, although it depends on the algorithm implementation. Comparing the RND and CMS strategies clearly shows that the additional cost of CMS is small; on some instances, the CPU time of CMS is even slightly less than that of RND. On all instances, BCS needs more time than CMS, which is consistent with the above analysis.
3.3. Influence of control parameters. There are two control parameters in CMS: the number of clusters K, and the number of generations G after which the clustering process is restarted. This section studies the influence of these two parameters. Two unimodal functions (F2 and F3), two multimodal functions (F7 and F8), and two hybrid composition functions (F16 and F17) are used. The population size is N = 100, the cluster number is set to K = 2, 4, 6, or 8, and the restart interval is set to G = 5, 10, 20, or 30. The other parameters are the same as in the previous section. Fig. 2 plots the error bars of the results obtained by CMS-CoDE with different combinations of control parameters over 30 runs on the 6 instances. On F2, the performance clearly decreases as K and G increase. The reason is that F2 is a unimodal problem whose best cluster number is 1, and k-means fails to capture the population structure with the given control parameters. On the contrary, on F8 the performance increases as K and G increase; F8 is a multimodal problem, and a larger number of clusters may lead to a better population partition. On F7, CMS-CoDE obtains very stable results, indicating that CMS is not sensitive to the control parameters on this problem. On F3, F16, and F17, the performance curves are not stable and the standard deviations are large for several combinations. We can also see from Fig. 2 that the performance is more sensitive to K than to G. A moderate number of clusters is suitable.

4. Conclusions. In this paper, we proposed a strategy for integrating statistical and machine learning (SML) techniques to guide the search of evolutionary algorithms (EAs) efficiently. The idea is to combine the SML iteration and the EA iteration, so that the learning process and the optimization process are performed alternately.
As an example, a general clustering based mate selection (CMS) assisted EA framework was proposed. In CMS, the population is partitioned into clusters and only parents in the same cluster are allowed to mate for offspring reproduction. More specifically, a CMS utilizing the k-means clustering technique was designed and integrated into a state-of-the-art EA. The experimental results suggest that CMS can improve the performance of existing EAs, and the time complexity analysis shows that the proposed approach adds little cost to the EA it improves. The current work is preliminary, and there are a variety of directions worth exploring: the combination of CMS and EAs should be improved; how to organize data for model building should be studied; and it is worth applying the CMS strategy to multi-objective optimization problems.