SPEEDING UP A MEMETIC ALGORITHM FOR THE MAX-BISECTION PROBLEM

Abstract. Max-bisection of a weighted graph G = (V, E) is to partition the vertex set V into two subsets of equal cardinality such that the sum of the weights of the edges connecting vertices in different subsets is maximized. It is an NP-complete problem. Various heuristics have been proposed for the max-bisection problem; some of them obtain solutions of very good quality, but are time consuming. In this paper, we propose several techniques to speed up Lin and Zhu's memetic algorithm for this problem. The resulting algorithm consists of a crossover operator with a greedy strategy, a fast modified Kernighan-Lin local search, and a new population updating function. Experiments are performed on 71 well-known G-set benchmark instances. The results show that on middle-scale and large-scale instances our memetic algorithm is much faster than Wu and Hao's memetic algorithm, which obtains the best known solutions to date. Moreover, some current best known solutions of the test instances are improved.


1. Introduction. Let G = (V, E, W) be a weighted undirected graph, where V = {v_1, v_2, ..., v_n} is a set of n vertices, E = {e_1, e_2, ..., e_m} is a set of m edges, and W = {w_1, w_2, ..., w_m} is a set of m integers, where each element w_i is the weight of the corresponding edge e_i. An unweighted graph can be regarded as a weighted graph with unit weight on each edge. The max-bisection problem (MBP) consists in partitioning the vertex set V into two disjoint subsets, A ⊂ V and B = V \ A, with their sizes differing by at most one, while maximizing the sum of the weights of the edges connecting vertices in different subsets.
MBP is a known NP-complete problem [32], and has a large number of applications in various areas, such as sparse matrix computation [1], scientific computing [18,38], and VLSI design [3,6,27]. A wide variety of algorithms have been proposed for MBP because of this wide practical applicability. Some of the exact algorithms are based on the polyhedral cut-and-price approach [26] or the branch-and-cut method [5]. These exact algorithms are time-consuming and cannot be applied to practical large graphs with more than a few hundred vertices, even though they return optimal solutions.
Currently, there are a number of approximation algorithms for finding approximate optimal solutions of MBP. Semidefinite programming (SDP) has been used with great success in designing approximation algorithms for MBP since it was proposed by Goemans and Williamson [14] in 1995. Frieze and Jerrum [13] extended this algorithm and obtained an approximation algorithm with approximation factor 0.651. This was improved to 0.699 by Ye [43] in 2001. By strengthening the semidefinite programming relaxation with triangle inequalities, Halperin and Zwick [15] proposed a 0.7016-approximation algorithm for MBP. More recently, Raghavendra and Tan [35] gave a 0.85-approximation algorithm by using semidefinite programming hierarchies, and this was further improved to a 0.8776-approximation by Austrin et al. [2]. In a celebrated result, Hastad [16] proved that there is no approximation algorithm for MBP with approximation ratio greater than 16/17 unless P = NP. Khot et al. [24] showed that MBP is hard to approximate above α_GW ≈ 0.8786 under the Unique Games Conjecture (UGC), which means that the algorithm of [2] is nearly optimal if the UGC holds. For some special graphs, e.g., planar graphs [20] and regular graphs [11,22], there are polynomial time approximation schemes (PTAS).
The results mentioned above are of great theoretical interest. However, since they require solving semidefinite programs, these approximation algorithms still struggle with large-scale instances because of the computational cost. For larger instances, researchers often resort to heuristics or metaheuristics to find good-quality solutions in a reasonable running time, although without performance guarantees. Xu et al. [42] proposed a Lagrangian net algorithm for MBP, which relaxes the bisection constraint into the objective function using the penalty function method, and solves the relaxed problem iteratively by a discrete Hopfield neural network. Ling et al. [29] described a modified variable neighborhood search metaheuristic for the max-bisection problem. A deterministic annealing algorithm was presented in [9].
Memetic algorithms have also been used to solve the max-bisection problem with promising experimental results [28,41]. Lin and Zhu [28] presented a memetic algorithm which consists of a fast local search procedure, a fixed crossover operator, and a population updating strategy. Numerical results and comparisons on well-known test problems showed that their memetic algorithm was effective. However, their crossover operator randomly partitions the unfixed vertices into two subsets of equal cardinality, neglecting the structural information of the fixed vertices. This makes it hard for the algorithm to generate high-quality offspring and to find better solutions quickly. Wu et al. [41] presented an efficient memetic algorithm for the max-bisection problem, integrating a grouping crossover operator and a perturbation-based tabu search procedure. Computational results on standard test problems indicated that their memetic algorithm improved the current best known solutions on many tested instances. However, it is very time-consuming, especially on large-scale instances.
The above memetic algorithms obtain solutions of very good quality for MBP, but both suffer from long solution times, especially on large-scale instances. Hence it is worthwhile to develop a memetic algorithm that finds near-optimal solutions quickly for large-scale instances. In this paper, we focus on speeding up the memetic algorithm of [28] for MBP while obtaining competitive solutions. The proposed algorithm consists of three components. Firstly, a crossover operator is introduced to generate an offspring. The crossover operator inherits the common structural information shared by the parents, and uses a new greedy method to partition the vertices on which the parents disagree. This strategy helps to generate high-quality offspring. Then a simple perturbation is applied with some probability to diversify the search, and an efficient local search procedure refines the offspring obtained by the crossover operator. Finally, the offspring is accepted or rejected for updating the population by considering both the quality and the diversity of solutions. Extensive experiments have been carried out on a number of benchmark instances from the literature. For large graphs with more than 8000 vertices, our memetic algorithm finds solutions whose best and average cut values are no worse than those of Wu and Hao [41] in much less running time.
The main contributions of this work are as follows: (1) A new crossover operator for MBP is proposed, which not only inherits the common structural information of the parents, but also uses a greedy method to construct the uncommon part of an offspring. This method generates offspring of good quality.
(2) The Kernighan-Lin local search procedure [23] is modified by moving one vertex from one part of the partition to the other alternately, and by stopping every pass early. Moreover, a bucket data structure is used in the local search algorithm. These two strategies speed up the algorithm effectively.
(3) Different from [28], a new updating function is used for updating the population. The new function controls the population of individuals effectively for MBP.
The paper is organized as follows. In Section 2, the framework of the proposed memetic algorithm is given, and then the components of the memetic algorithm are described in detail. Extensive computational results and comparisons are presented in Section 3. Finally, conclusions are given in Section 4.

2.1. Problem formulation, solution representation and some definitions.
Recall that MBP requires partitioning the vertex set V into two subsets A and B of equal cardinality (a bisection), i.e., ||A| − |B|| ≤ 1, such that the sum of the weights of the cross edges is maximized. Without loss of generality, assume that the number of vertices is even; otherwise we can add an isolated vertex with no incident edges. Thus a feasible solution (A, B) of MBP satisfies |A| = |B|.
A feasible solution (A, B) of MBP is coded as a string x = (x_1, x_2, ..., x_n), with a one-to-one correspondence between bits and vertices: x_i = 1 if v_i ∈ A, and x_i = −1 if v_i ∈ B. Thus we can convert easily between a string and a bisection. The MBP can be formulated as the following constrained binary quadratic programming problem:

max f(x) = (1/2) Σ_{(v_i, v_j) ∈ E} w_ij (1 − x_i x_j)
s.t. Σ_{i=1}^{n} x_i = 0, x_i ∈ {−1, 1}, i = 1, 2, ..., n.

In this binary formulation, the objective function f(x) can be written as f(x) = (1/4) x^T L x, where D is a diagonal matrix whose diagonal element d_ii is the sum of the weights of all the edges incident to the vertex v_i, i = 1, 2, ..., n, A is the adjacency matrix, and L = D − A is the Laplacian matrix of the graph G [8,39,40] (the weights are generally required to be nonnegative). The values of the x_i are binary so as to assign v_i to either one of the two subsets. The constraint ensures that the corresponding partition is balanced. Clearly, x̄ = (−x_1, −x_2, ..., −x_n) represents exactly the same bisection as x.

Definition 1. [28] For a feasible solution x = (x_1, x_2, ..., x_n), its symmetric solution x̄ is defined as x̄ = (−x_1, −x_2, ..., −x_n).
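As a concrete illustration of this formulation, the following minimal sketch (the function names are ours, not from the paper) computes the cut value of a ±1-encoded bisection both edge-by-edge and through the Laplacian quadratic form f(x) = (1/4) x^T L x; the two always agree.

```python
# Illustrative sketch: cut value of a bisection encoded as a +-1 string,
# computed two ways: an edge-by-edge sum and the quadratic form (1/4) x^T L x.
def cut_value(edges, x):
    """edges: list of (i, j, w); x: list of +1/-1 vertex labels."""
    return sum(w for i, j, w in edges if x[i] != x[j])

def cut_value_laplacian(edges, x):
    # x^T L x = sum over edges of w * (x_i - x_j)^2, which is 4w on cut
    # edges and 0 otherwise, so dividing by 4 recovers the cut value.
    return sum(w * (x[i] - x[j]) ** 2 for i, j, w in edges) // 4

edges = [(0, 1, 3), (1, 2, 2), (2, 3, 5), (3, 0, 1)]
x = [1, -1, -1, 1]            # balanced bisection: {v0, v3} vs {v1, v2}
assert sum(x) == 0            # the bisection constraint
assert cut_value(edges, x) == cut_value_laplacian(edges, x) == 8
```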

Definition 2. [28]
Given two bisections (A_x, B_x) and (A_y, B_y), assume that the corresponding solutions are x and y, respectively. In order to measure the difference between (A_x, B_x) and (A_y, B_y), the distance between x and y, denoted by d(x, y), is defined as the least number of one-bit changes needed to transform (A_x, B_x) into (A_y, B_y). Since a solution and its symmetric solution represent the same bisection, d(x, y) = min{H(x, y), n − H(x, y)}, where H denotes the Hamming distance.

Definition 3. [28]
Given a population of solutions P = {x_1, x_2, ..., x_p}, the distance between P and some solution x_i in P, denoted by d(x_i, P), is defined as

d(x_i, P) = min{d(x_i, x_j) | x_j ∈ P, j ≠ i}.   (2)

2.2. Framework of the memetic algorithm. The memetic algorithm (MA) is an efficient approach for hard combinatorial optimization problems [31,33]. MA is based on the Genetic Algorithm (GA) [19], which mimics the process of natural selection. It starts from an initial population and explores the solution space by repeatedly applying a crossover operator to generate new individuals and a local search procedure to improve the newly generated solutions. The success of an MA is most often determined by the crossover operator, which must be adapted to the problem and should be able to pass meaningful features from parents to offspring. The local search procedure is applied to refine the new solutions quickly and effectively. The quality of an MA is influenced by the quality of the individuals as well as the diversification of the population.
The general structure of the memetic approach for MBP is presented in Algorithm 1. At line 1, each individual in the initial population P = {x_1, x_2, ..., x_p} is generated by a randomized algorithm (Algorithm S, the selection sampling technique on page 142 of Knuth's book [25]), and then refined by the local search procedure. According to Algorithm S, the current i-th vertex v_i is added to subset A if (n − i + 1)r < n/2 − s, and to B otherwise, where n is the total number of vertices, s is the number of vertices selected and added to A so far, and r ∈ [0, 1) is a uniformly distributed random number. At line 2, the solution with the maximum cut value is chosen as the current best solution x* in P. Then the process of lines 4-18 of Algorithm 1 is repeated at most Max_Generation (a user-defined threshold) times. The iteration also terminates if it fails to find a better solution for Max_NoUpdate (another user-defined threshold) consecutive generations.
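The balanced random initialization above can be sketched as follows (the function name is illustrative); Algorithm S guarantees that exactly n/2 vertices are selected into A, so the resulting bisection is always balanced.

```python
import random

# Sketch of Knuth's Algorithm S (selection sampling) used to draw a random
# balanced bisection: vertex v_i joins A exactly when (n - i + 1) * r < n/2 - s.
def random_bisection(n, rng=random.random):
    x, s = [], 0                       # s = number of vertices placed in A so far
    for i in range(1, n + 1):
        r = rng()                      # uniform random number in [0, 1)
        if (n - i + 1) * r < n / 2 - s:
            x.append(1); s += 1        # v_i -> A
        else:
            x.append(-1)               # v_i -> B
    return x

x = random_bisection(100)
assert sum(x) == 0                     # exactly n/2 vertices on each side
```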
At each generation, two solutions are selected randomly as parents for generating a new offspring x_0 by the crossover operator (lines 5-6). After several iterations, the solutions in the population become remarkably similar, and the solutions obtained by the crossover operator and the local search procedure are prone to get trapped in local optima. Hence, similarly to [41], the solution obtained by the crossover operator is perturbed with some probability p_α (lines 7-9). Then the local search procedure is used to refine the solution x_0 (line 10), where the refined solution is still denoted x_0. Appropriate actions are performed depending on the cut value of the refined offspring x_0 (lines 11-16). If x_0 is better than the current best solution x*, then x* is updated to x_0. Finally, the strategy Pool_Updating is applied to determine whether or not the refined offspring is inserted into the population P (line 17). In the following subsections, we describe the components of our MA in more detail.

Algorithm 1 The structure of MA
Input: An undirected weighted graph G.
Output: The best solution found x* and f(x*).
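Assuming the components described in this section (crossover, local search, perturbation, pool updating) as injected stand-in functions, the overall loop of Algorithm 1 can be sketched as follows; every name here is an illustrative placeholder, not the paper's implementation.

```python
import random

# Hedged sketch of the MA framework (Algorithm 1); crossover, local_search,
# perturb, and pool_update are problem-specific operators supplied by the caller.
def memetic_algorithm(init_pop, crossover, local_search, perturb, pool_update,
                      f, max_gen=100, max_no_update=20, p_alpha=0.2):
    P = [local_search(x) for x in init_pop]      # line 1: refine the initial pop
    best = max(P, key=f)                         # line 2: current best solution
    no_update = 0
    for _ in range(max_gen):
        xf, xm = random.sample(P, 2)             # lines 5-6: pick two parents
        x0 = crossover(xf, xm)
        if random.random() < p_alpha:            # lines 7-9: occasional diversification
            x0 = perturb(x0)
        x0 = local_search(x0)                    # line 10: refine the offspring
        if f(x0) > f(best):
            best, no_update = x0, 0
        else:
            no_update += 1
            if no_update >= max_no_update:       # stagnation-based early stop
                break
        P = pool_update(P, x0)                   # line 17: pool updating
    return best
```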

2.3. Crossover operator.
A feasible solution for MBP is a partition of the vertex set V into two subsets with equal cardinality, i.e., there must be the same number of 1s as −1s in the corresponding string. Our MA uses the fixed crossover operator [7,10], in which the vertices with label 1 are as many as the vertices with label −1 in two bisections. Firstly, we present the following results.
Since MA mimics the process of natural selection, an offspring needs to inherit the good common structural information shared by its parents x and y. By Theorems 2.1 and 2.3, we know that the number of common vertices is not less than n/2 (we can replace y by its symmetric solution ȳ if necessary). Actually, the number of inherited vertices equals n − d(x, y) by Theorem 2.3.
By the above theorems, we present the crossover operator in Algorithm 2. At the beginning of the crossover operator, two different individuals x_f and x_m are selected as parents from the population P at random (replacing x_m by its symmetric solution if necessary, so that d(x_f, x_m) ≤ n/2). The components on which the two parents agree are copied into the offspring x_0, and the remaining components of x_0 are labeled 0 (lines 5-11). At line 12, the variable n_free counts the number of components of x_0 with label 0. We know that n_free = d(x_f, x_m) for the partial solution x_0 by Theorem 2.3. Furthermore, the partial solution x_0 contains as many components labeled −1 as components labeled 1 by Lemma 2.2, i.e., the partial partition is also balanced. For the components labeled 0 in x_0, inspired by the Differential-Greedy algorithm of Battiti and Bertossi [4], the final solution is obtained by fixing one component to 1 and another to −1 alternately in a greedy manner (lines 13-19), as follows.
For every vertex v_i labeled 0, define its gain g_i as the sum of the weights of the edges connecting v_i to vertices labeled −1 minus the sum of the weights of the edges connecting v_i to vertices labeled 1, i.e.,

g_i = Σ_{k: x_k = −1} w_ik − Σ_{k: x_k = 1} w_ik.

To increase the cut value, we choose a vertex labeled 0 and fix its value so as to obtain a larger sum of weights of the cut edges. This is done in a greedy manner.
Between lines 13 and 19, the algorithm selects a vertex v_i whose gain is maximum and a vertex v_j whose gain is minimum among the vertices labeled 0, and then sets x_i = 1 and x_j = −1. If several vertices have the same maximum gain, one of them is selected at random. The final bisection is obtained after d(x_f, x_m)/2 stages, and it must be balanced because of the balanced initial partial partition and the alternating assignment. For ease of selecting and updating, we may use the bucket array structure and updating strategy presented in Subsection 2.4. Finally, it is easy to see that the computational complexity of the algorithm between lines 13 and 19 is O(m) [4].
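A sketch of this greedy completion step (lines 13-19), under the assumption that the number of free vertices is even (which holds because the partial partition is balanced); the function name and adjacency representation are illustrative, and ties are broken deterministically rather than at random.

```python
# Sketch of the greedy completion of a partial crossover solution: free
# vertices (label 0) are fixed in pairs, the max-gain one to +1 and the
# min-gain one to -1, keeping the partial partition balanced throughout.
def greedy_complete(adj, x):
    """adj[i]: list of (j, w) neighbours of v_i; x: partial solution in {-1,0,+1}."""
    free = {i for i, v in enumerate(x) if v == 0}
    # gain g_i = (weight towards the -1 side) - (weight towards the +1 side);
    # vertices labeled 0 contribute nothing since x[j] == 0 there.
    gain = {i: sum(w * -x[j] for j, w in adj[i]) for i in free}
    while free:                                # assumes len(free) is even
        i = max(free, key=lambda v: gain[v])   # fix the max-gain vertex to +1
        free.remove(i); x[i] = 1
        j = min(free, key=lambda v: gain[v])   # and the min-gain vertex to -1
        free.remove(j); x[j] = -1
        for moved, lab in ((i, 1), (j, -1)):   # refresh gains of free neighbours
            for nb, w in adj[moved]:
                if nb in free:
                    gain[nb] -= w * lab
    return x

adj = [[(2, 5), (3, 1)], [(2, 1), (3, 4)], [(0, 5), (1, 1)], [(0, 1), (1, 4)]]
assert greedy_complete(adj, [1, -1, 0, 0]) == [1, -1, -1, 1]
```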
The algorithm between lines 13 and 19 is different from the Differential-Greedy algorithm [4]. The Differential-Greedy algorithm starts from a partial bisection in which each subset contains only one vertex, and these two vertices are selected randomly. In contrast, the algorithm between lines 13 and 19 constructs a solution from a partial solution in which at least n/2 vertices are fixed, and the two vertices of each stage are selected in a greedy manner.
2.4. Local search. The Kernighan-Lin algorithm (K-L) [23] is one of the most famous heuristics for graph bisection. K-L can escape from local optima by accepting non-improving swaps. It consists of passes, and starts from some bisection (A, B). At the beginning of each pass, all vertices are unlocked (i.e., free to be swapped). The difference value (or D value) of a vertex v_i, denoted D_i, is the cut value increment obtained by moving v_i from its current subset to the other. Mathematically,

D_i = Σ_{j: x_j = x_i} w_ij − Σ_{j: x_j ≠ x_i} w_ij.

Theorem 2.4.
[28] Given a bisection (A, B) of G, suppose that the vertex v_i is removed from its current subset and inserted into the other. Then the new D value D'_j of a vertex v_j (j ≠ i) can be computed quickly as follows:

D'_j = D_j − 2w_ij if v_j is in the same subset as v_i before the move, and D'_j = D_j + 2w_ij otherwise,

where w_ij is the weight of the edge (v_i, v_j), taken as zero if no such edge exists. We denote by g_ab the cut value increment obtained by swapping the vertices v_a and v_b, which can be computed as

g_ab = D_a + D_b + 2w_ab.

The K-L repeatedly chooses a pair of unlocked vertices (v_a, v_b) ∈ A × B with the highest gain g_ab, swaps them, and locks them, until all vertices are locked. A pass stops after swapping n/2 pairs of vertices. The K-L algorithm terminates if it fails to find a better bisection in the current pass; otherwise a new pass starts from the best bisection found. Obviously, the K-L algorithm takes O(n^3) time per pass.
The Fiduccia-Mattheyses algorithm (F-M) [12] was used in [28] for max-bisection of graphs; it requires only O(m) time per pass by transferring a single unlocked vertex from its current partition to the other. To keep the partition balanced, the F-M selects a vertex with the maximum D value from the two partitions alternately. To reduce the time complexity of K-L, based on the idea of the F-M, we modify the K-L and obtain a local search procedure that iteratively improves max-bisection solutions. The modified K-L also consists of a series of passes, as described in Algorithm 3.
Each pass is described between lines 3-27. At the beginning of a pass, all vertices are initialized to be unlocked. x_new is the best solution found in a pass (lines 4-6), and is initialized to x_old. An unlocked vertex v_i with the highest D value is selected from the subset F_A, in which the vertices are labeled −1, and moved to the other subset F_B, i.e., v_i is deleted from F_A (lines 11-13). Then v_i is locked, and the D values of the adjacent vertices of v_i are updated by Theorem 2.4. A similar operation is carried out on F_B: an unlocked vertex with the highest D value in F_B, say v_j, is moved and locked. After v_i and v_j are swapped, a new bisection is found. The variable gain_sum records the increment of the cut value compared with x_old. If gain_sum > 0, which indicates that a better bisection has been found, then the best solution x_new of the current pass is updated. The proposed local search procedure needs O(n) time to select a pair of vertices to swap, so the time complexity per pass is O(n^2). Hence the time complexity of the proposed local search procedure is much lower than that of K-L.
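One pass of this modified K-L procedure can be sketched as follows, using plain lists instead of the bucket structure of Subsection 2.4 (so vertex selection here is O(n) rather than O(1)); the function name and details are illustrative.

```python
# Minimal sketch of one pass of the modified K-L local search (Algorithm 3):
# vertices are moved alternately, one from the -1 side and one from the +1
# side, and the pass is cut short after n*gamma swapped pairs.
def kl_pass(adj, x, gamma=1/6):
    """adj[i]: list of (j, w) neighbours of v_i; x: a balanced +-1 solution."""
    n = len(x)
    # D_i = (weight to v_i's own side) - (weight to the other side): the cut
    # increment obtained by moving v_i across the partition.
    D = [sum(w if x[j] == x[i] else -w for j, w in adj[i]) for i in range(n)]
    locked = [False] * n
    best, best_gain, gain_sum = x[:], 0, 0

    def move(side):                        # move the best unlocked vertex of `side`
        nonlocal gain_sum
        cand = [v for v in range(n) if not locked[v] and x[v] == side]
        i = max(cand, key=lambda v: D[v])
        gain_sum += D[i]
        locked[i] = True
        x[i] = -side
        for j, w in adj[i]:                # Theorem 2.4: neighbours on v_i's old
            D[j] += -2 * w if x[j] == side else 2 * w   # side lose 2w, others gain 2w

    for _ in range(max(1, round(n * gamma))):   # early stop after n*gamma pairs
        move(-1)
        move(+1)
        if gain_sum > best_gain:           # record the best bisection of the pass
            best, best_gain = x[:], gain_sum
    return best, best_gain
```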
In every pass of K-L, n/2 pairs of vertices are selected and swapped. Empirically, the main part of the improvement during a pass occurs in the first several swaps. Hence, in Algorithm 3, we stop a pass after nγ pairs of vertices are swapped, where γ is a parameter satisfying 0 < γ < 1/2. If a pass fails to find a better solution (i.e., flag = 0), then the local search procedure terminates. Otherwise, x_old ← x_new and a new pass is started from the current best bisection x_new of the previous pass. In this way we can speed up the K-L algorithm for large-scale instances.
Another strategy to speed up the K-L algorithm is to use a special data structure, the gain bucket, for quickly selecting a vertex and updating the D values of its adjacent vertices. This data structure was originally proposed in [12] for hypergraph partitioning. The bucket data structure maintains two arrays of gain buckets, one for each part of the partition (A, B). An array of gain buckets keeps all possible gains of the vertices of the corresponding part, indexed in sorted order. Each bucket is a doubly linked list keeping the vertices of the part with the same gain (a bucket may be empty). Moreover, an additional array over all vertices of the graph is maintained, where each element points to the corresponding vertex's node in the doubly linked lists.
At the beginning, the vertices with the same D value d are put into the bucket ranked d; vertices in different parts of the partition go into different bucket arrays. We use two special pointers, one per array, to track the highest index whose bucket is nonempty. In this way, we can select the vertex with the highest D value in a part of the partition in constant time. After a vertex is moved, we update the D values of the affected vertices according to Theorem 2.4, and transfer these vertices to the appropriate buckets. Since delete and insert operations on a bucket are both of O(1) complexity, each affected vertex is transferred to its appropriate bucket in constant time, so the total cost of moving one vertex from one part of the partition to the other is proportional to its degree.
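A toy sketch of one gain-bucket array, with Python sets standing in for the doubly linked lists (a real F-M implementation uses arrays of doubly linked lists plus a vertex-indexed pointer array for strict O(1) deletion); the class and method names are ours.

```python
# Toy sketch of a gain-bucket array for one side of the partition. `top`
# tracks the highest non-empty bucket, decaying downward as buckets empty,
# so the best vertex is found in amortized constant time.
class GainBucket:
    def __init__(self):
        self.buckets = {}        # D value -> set of vertices in that bucket
        self.top = None          # highest bucket index seen so far

    def insert(self, v, d):
        self.buckets.setdefault(d, set()).add(v)
        if self.top is None or d > self.top:
            self.top = d

    def delete(self, v, d):
        self.buckets[d].discard(v)

    def update(self, v, old_d, new_d):   # D value changed by +-2w (Theorem 2.4)
        self.delete(v, old_d)
        self.insert(v, new_d)

    def pop_max(self):
        # assumes the structure is non-empty; the pointer decays past
        # buckets that have been emptied by moves and updates
        while not self.buckets.get(self.top):
            self.top -= 1
        d = self.top
        return self.buckets[d].pop(), d
```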
2.5. Pool updating. After a new solution is created by the crossover operator and refined by the local search procedure, we must decide whether to insert it into the population P. In this paper, this decision is based on both the intensification and the diversification of the population; obviously, the two factors influence each other. On the one hand, in order to pass good genes on to offspring, a solution with a large cut value should be given high priority for insertion into the population; on the other hand, it is necessary to preserve the diversity of the population, so that the algorithm can explore the solution space over the long term and find better solutions.
We use the distance d(x_i, P) between a solution x_i and the population P, defined in (2), to evaluate how much the solution x_i diversifies P. To account for both the quality of a solution and the diversity of the population, we take into consideration the objective value f(x_i) as well as the distance d(x_i, P) [28,41]. In general, d(x_i, P) is roughly proportional to the total number of vertices, while the difference in cut value between x_i and the best solution x* is not significant. Inspired by [37], the following function h(x) is used to measure the score of a solution x:

h(x) = d(x, P)/n + λ · (f(x) − f(x*)),   (5)

where n is the number of vertices, x* is the best solution found up to the current iteration, and λ is a parameter striking a balance between solution quality and population diversity. The former part of (5) is a normalized distance. The latter part of (5) expresses the quality of x directly, since the gap between f(x_i) and f(x*) is relatively small. It takes O(n) time to compute the distance d(x_i, x_j) for any two individuals x_i and x_j, so in total it takes O(pn) time to calculate d(x_i, P) when evaluating the contribution of x_i to the diversity of the population P. Whether a new offspring is inserted into the population, and which individual it replaces, are decided according to the h(·) values. Our updating function (5) is different from the function used in [28]. Since the value d(x, P)/n varies little across instances, while the objective value varies considerably, our updating function uses λ · (f(x) − f(x*)) instead of f(x) − f(x*). This balances solution quality and population diversity more effectively.
The scheme of our Pool_Updating is illustrated in Algorithm 4. A temporary population is constructed by inserting the offspring x_0 into P. The scores of all the individuals are calculated in lines 2-4. The algorithm finds the individual x_min with the minimum score among the original population P. If its score is not better than that of x_0, then x_0 is inserted into P and x_min is deleted from P; otherwise, x_0 replaces x_min with probability p_r. Pool_Updating achieves dynamic control of the population P by using the quality-and-distance scoring function (5) throughout the search.
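The scoring function (5) and the pool-updating rule can be sketched as follows (all names are illustrative; the distance uses the symmetric-solution convention of Definition 2):

```python
import random

# Sketch of the quality-and-distance pool updating (Algorithm 4); lam and
# p_r correspond to the paper's lambda and p_r parameters.
def dist(x, y):
    h = sum(a != b for a, b in zip(x, y))
    return min(h, len(x) - h)            # x and -x encode the same bisection

def score(x, pool, f_best, f, lam):
    d_xP = min(dist(x, y) for y in pool if y is not x)   # d(x, P), Definition 3
    return d_xP / len(x) + lam * (f(x) - f_best)         # function (5)

def pool_update(pool, x0, f, lam=0.003, p_r=0.2, rng=random.random):
    f_best = max(f(x) for x in pool + [x0])
    tmp = pool + [x0]                                    # temporary population
    worst = min(pool, key=lambda x: score(x, tmp, f_best, f, lam))
    if score(worst, tmp, f_best, f, lam) <= score(x0, tmp, f_best, f, lam) \
            or rng() < p_r:
        return [x for x in pool if x is not worst] + [x0]
    return pool                                          # x0 is discarded
```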

2.6. Perturbation. After the MA repeats the crossover, the local search procedure, and the pool updating for a number of generations, it gets stuck in local optima and the corresponding bisections become very similar. It is then difficult to find better bisections by simply continuing these three operations.
In order to explore a new part of the solution space, after the iteration process has run for perturb_start generations, the MA uses the crossover operator to generate a new solution x_0, and then the following simple perturbation is applied to x_0 with probability p_α to diversify the search. The perturbation randomly selects nα vertices from each subset, where α is a parameter indicating the strength of the perturbation, and swaps them. We set α = 1/10, perturb_start = max{1000, n/10}, and p_α = 0.2.
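A sketch of this perturbation (the function name is illustrative); swapping equally many vertices from each side preserves the balance of the bisection.

```python
import random

# Sketch of the perturbation step: n*alpha randomly chosen vertices from
# each subset are swapped, preserving the balance of the bisection.
# The paper sets alpha = 1/10 and applies this with probability p_alpha.
def perturb(x, alpha=0.1, rng=random):
    k = max(1, int(len(x) * alpha))
    side_a = [i for i, v in enumerate(x) if v == 1]
    side_b = [i for i, v in enumerate(x) if v == -1]
    for i, j in zip(rng.sample(side_a, k), rng.sample(side_b, k)):
        x[i], x[j] = -1, 1        # swap one vertex from each side
    return x
```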
3. Experimental results. To examine the efficiency of our approach, experiments are presented in this section. The algorithm was programmed in C and run on a PC with an Intel(R) Core(TM)2 Duo CPU E7500 (2.93 GHz) and 2.0 GB of RAM under Windows XP. In Subsection 3.1, we introduce the benchmarks used in the experiments. The preliminary experiments used to set the parameters of the algorithm are described in Subsection 3.2. Finally, in Subsection 3.3 we report the results obtained by our MA on the benchmarks and compare them with Wu and Hao's algorithm [41], the most recent memetic algorithm with the best published results.
3.1. Benchmarks. The benchmarks are the G-set instances [17], which contain 71 instances. They were obtained by running a generator called "rudy" written by S. E. Karisch. The number of vertices of these benchmarks ranges from 800 to 20000, and the edge density varies from 0.17% to 6.12%. The G-set instances include toroidal, planar, and randomly generated weighted graphs, and have been widely used to evaluate algorithms for the max-cut and max-bisection problems [28,30,34,36,41,43].

3.2. Preliminary experiments. In Algorithm 3, the parameter γ reflects the percentage of vertices moved in every pass (see lines 10-26 of Algorithm 3). According to our empirical experience, the local search procedure usually improves a solution during the first small percentage of vertex moves in a pass, and if it tries to move too many vertices per pass, it easily gets stuck at local optima. Hence the value of γ should be neither too small nor too large, and it is necessary to determine a suitable value. We test γ with the values {1/24, 2/24, 3/24, 4/24, 5/24, 6/24, 7/24, 8/24, 9/24, 10/24, 11/24, 12/24} by running the local search procedure on the solutions generated by Algorithm S. Table 1 reports the results, in which f_avg denotes the averaged cut value obtained and t_avg denotes the averaged CPU time (in seconds) used. Fig. 1 plots the averaged cut value and the averaged CPU time (in seconds) on instance G51 for each value of γ. For most graphs in Table 1, the averaged cut value f_avg increases with γ until it reaches a certain level and then hovers around it, whereas the averaged running time quickly falls to a minimum and then rises gradually. For example, on instance G51, f_avg hovers around 3778 after γ reaches 3/24, and the averaged running time reaches its minimum at γ = 4/24. Nevertheless, the averaged cut value of G63 is poorer than its maximum when γ ≥ 6/24 or γ ≤ 3/24.
On instance G65, when γ ∈ {2/24, 11/24, 12/24}, the local search procedure requires much more running time than for the other values. Hence, if γ is set too small, the local search procedure needs more running time and may not find a good solution, because it does not search a large neighborhood and sometimes requires more passes to terminate. Conversely, if γ is set too large, the procedure may not return a satisfactory solution because it falls into a deep local optimum. Hence the value of γ needs to be set properly. From Fig. 1, we find that γ = 4/24 = 1/6 is an appropriate value for instance G51, and Table 1 shows that with this value most of the test instances obtain good solutions in a reasonable running time.
The parameter p controls the size of the population P. Generally, provided the diversity of the population is maintained, a larger population benefits solution quality, but the MA becomes time-consuming if p is too large. The parameter λ in (5) is used to maintain a balance between solution quality and population diversity; a larger λ puts more emphasis on solution quality. To analyze empirically how p and λ influence our MA, we design the following experiment. We fix γ = 1/6, the updating probability p_r = 0.2, the maximum number of generations Max_Generation = 10000, and the allowable number of consecutive generations without finding a better solution Max_NoUpdate = 2000. We run our MA 20 times on instance G65 for values of p in the range [30, 100] and λ ∈ {0.001, 0.003, 0.005, 0.008, 0.01, 0.015}. The results are reported in Table 2, in which f_best denotes the best cut value among the 20 runs and f_avg the cut value averaged over the 20 runs. Table 3 shows the best and averaged results over the 20 runs for different Max_NoUpdate values. From these two tables, we observe that the best results are reached at the combination (p = 80, λ = 0.003, Max_NoUpdate = 2000).

3.3. Performance comparison with a current best algorithm. The memetic algorithm of Wu and Hao (WHMA) [41] was published recently and obtains the best known solutions for the max-bisection instances. In this section, we compare the performance of our memetic algorithm with WHMA on the G-set test instances. WHMA was run on a PC with an Intel Xeon E5440 processor (2.83 GHz, 8 GB RAM) under Windows XP. According to the Standard Performance Evaluation Corporation (SPEC), that computer is about 1.035 times slower than the one used in our experiments. All parameters of our MA are set to the values in Table 4. The parameter Max_NoUpdate is increased to 3000 for the sake of better performance, and our MA is stopped after it has iterated Max_Generation = 10000 generations. Under these parameter settings, our MA is run 20 times independently on each benchmark. The computational results are given in Table 5.

In Table 5, the columns 'Name' and '|V|' give the name and the number of vertices of each test instance, respectively. The columns 'MA' and 'WHMA' present the statistical results over the 20 runs of our memetic algorithm and the reported results of the memetic algorithm of Wu and Hao [41], respectively. The subcolumns 'f_best' give the best cut values found among the runs for each benchmark; the subcolumns 'f_avg' give the cut values averaged over the runs; the subcolumn 'hit' gives the number of runs attaining f_best among the 20 runs of MA; the subcolumns 't_opt' give the averaged CPU times (in seconds) for hitting f_best of the two algorithms; and the subcolumn 't_end' gives the averaged CPU time of our MA over the 20 runs. Note that the times in subcolumn 't_opt' of 'WHMA' are normalized, obtained by dividing the corresponding times in [41] by 1.035.
Finally, the last column '∆' shows the difference between the best cut values of MA and WHMA. From Table 5, one can observe that our MA finds the same best cut values as WHMA on 40 benchmark instances, finds better solutions than WHMA on 6 instances, and obtains worse solutions than WHMA on 25 instances. Specifically, for small test instances (fewer than 1000 vertices), our MA finds solutions with best cut values equal to WHMA's except on instances G13 and G19. For the middle-size instances (with 1000 to 5000 vertices), the best cut values obtained by our MA are slightly smaller than those of WHMA. However, our MA improves the best known solutions on 5 of the 11 large graphs (with 7000 to 20000 vertices). These observations show that, compared with WHMA, our MA achieves competitive results in terms of solution quality.
As for running time, one can observe that our MA uses less running time on 57 of the 71 test instances. In particular, our MA is 20 to 130 times faster than WHMA on the instances with at least 5000 vertices. Moreover, the running time of our MA grows slowly as the instance size increases.
The experimental results indicate that our MA can find competitive results quickly, and is capable of solving large-scale max-bisection problems efficiently.

4. Conclusion. In this paper, we have presented an effective and efficient memetic algorithm (MA) for the max-bisection problem. The algorithm consists of a crossover operator based on a Differential-Greedy strategy, a fast modified Kernighan-Lin local search, and a new pool updating strategy that integrates solution quality and population diversity. Experiments have been performed on the set of 71 well-known G-set graphs. The results demonstrate that our MA performs as well as Wu and Hao's memetic algorithm (the state-of-the-art method for the max-bisection problem) in terms of solution quality, in much less running time on most of the benchmarks. In particular, our memetic algorithm is well suited to dealing with large-scale graphs.