Covariance and crossover matrix guided differential evolution for global numerical optimization

Differential evolution (DE) is an efficient and robust evolutionary algorithm with wide application in various science and engineering fields. DE is sensitive to the selection of mutation and crossover strategies and their associated control parameters. However, the structure and implementation of DEs are becoming more complex because of the diverse mutation and crossover strategies that use distinct parameter settings during the different stages of the evolution. A novel strategy is used in this study to improve the crossover and mutation operations. The crossover matrix is proposed to implement the function of the crossover operation in place of the crossover operator and its control parameter CR. Meanwhile, a Gaussian distribution centered on the best individual found in each generation is sampled based on the proposed covariance matrix, which is generated from the best individual and several better individuals. An improved mutation operator based on the crossover matrix is randomly selected to generate the trial population. This operator generates high-quality solutions to improve exploitation and enhance exploration. In addition, a memory population is randomly chosen from the previous generation and used to control the search direction in the novel mutation strategy; accordingly, the diversity of the population is improved. The result, CCDE, is a novel, efficient, and simple DE variant presented in this paper. CCDE has been tested on 30 benchmarks and 5 real-world optimization problems from the IEEE Congress on Evolutionary Computation (CEC) 2014 and CEC 2011, respectively. Experimental and statistical results demonstrate the effectiveness of CCDE for global numerical and engineering optimization. CCDE solves the test benchmark functions and engineering problems more successfully than the other DE variants and algorithms from CEC 2014.

Differential evolution (DE) is one of the most efficient evolutionary algorithms (EAs) and has wide application in numerous numerical optimization problems in diverse fields (dos Santos Coelho et al. 2014). DE was first introduced by Storn and Price (1995). Like other EAs, DE is a population-based optimization algorithm. It primarily consists of a mutation operator and a crossover operator (Storn and Price 1997). Each individual in the DE population is called a target vector. First, a mutant vector is produced by the mutation operator. Then, a trial vector is generated by applying the crossover operator to the target and mutant vectors. Finally, the better solution is selected between the trial vector and its target vector according to their objective function values. DE has been successfully applied to various continuous optimization problems in many science and engineering fields because of its simple structure, easy operation, convergence property, solution quality, and robustness. DE has also been used in robot control (Wang and Li 2011), sensor array interrogation (Venu et al. 2008), cluster analysis (Maulik and Saha 2009), and other applications (Dong et al. 2014; Gundry et al. 2015; Zhang and Duan 2015). DE is sensitive to the choice of the mutation and crossover operators and their two associated control parameters, namely, the crossover control parameter CR and the scaling factor F (Qin et al. 2009). Much attention has been paid to the influence of these factors, and a series of different DEs has been proposed to improve optimization performance. Brest et al. (2006) proposed the jDE algorithm, a DE with self-adaptive parameter control, in which CR and F are encoded into the chromosome and participate in the evolution. Zhang and Sanderson (2009) improved F via the Cauchy distribution and CR via the normal distribution in the parameter-adaptive DE algorithm called JADE.
Moreover, self-adaptive equations for CR and F have been proposed to control their values as the generation count increases. Qin et al. (2009) proposed another self-adaptive DE, called SaDE, with a strategy pool as well as different parameter settings. Mallipeddi et al. (2011) proposed the EPSDE algorithm, a DE with an ensemble of control parameters and mutation strategies. EPSDE maintains a pool of distinct trial vector generation strategies and a pool of control parameters to self-adjust its search strategy along the iteration process. Wang et al. (2014) introduced the CoBiDE algorithm, which uses a covariance matrix learning strategy based on the current population distribution to initialize the population of DE and a bimodal distribution strategy to control the values of the two control parameters. These DE-based algorithms and other improved DEs have enhanced the optimization performance of DE to some extent. However, the simple structure of standard DE has been considerably changed, resulting in apparent difficulty in balancing exploration (searching for better individuals) and exploitation (using the existing material in the population to obtain the best effect) (Fraa et al. 2015).
Thus, we propose a covariance and crossover matrix-guided DE (CCDE), building on several studies (Ghosh et al. 2012; Santucci and Milani 2011; Zhabitsky and Zhabitskaya 2013), to solve these problems. The covariance matrix between the current best individual and several better individuals reflects the rotation information of the function to some extent; thus, the covariance matrix is used to guide the generation of new individuals. We introduce a Gaussian distribution centered on the best individual found in each generation based on the proposed covariance matrix. The crossover operator and its parameter CR are simplified and replaced by the crossover matrix, a random binary matrix composed of 0s and 1s. In addition, the memory population M is introduced to enhance the exploration of CCDE and is used to control the search direction of each generation. CCDE has been tested on 30 benchmarks chosen from the IEEE Congress on Evolutionary Computation (CEC) 2014 (Liang et al. 2013) and 5 real-world engineering problems selected from CEC 2011 (Das and Suganthan 2010). The performance of CCDE is compared with those of JADE, SaDE, EPSDE, and CoBiDE, as well as five algorithms from CEC 2014. The experimental and statistical results suggest that CCDE performs better than the compared algorithms.
The rest of this paper is organized as follows. Section "DEA" briefly introduces DE. CCDE is presented in section "CCDE". The experimental results are presented in section "Experimental study". Finally, section "Conclusion" presents the conclusions and future work.

DEA
DE is a population-based heuristic search algorithm and has four basic processes: initialization, mutation, crossover, and selection.

Initialization
DE performs initialization by randomly selecting N points from the search space using Eq. (1):

x_i,j,0 = low_j + rand(0, 1) × (up_j − low_j), i = 1, 2, …, N; j = 1, 2, …, D, (1)

where D denotes the dimension of the search space and N denotes the population size. Each vector element of x_i,0 is a random number uniformly distributed in the range [low, up], where low and up are the boundaries of the search space.
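For concreteness, the uniform initialization of Eq. (1) can be sketched in NumPy as follows (scalar bounds shared by all dimensions are assumed for simplicity):

```python
import numpy as np

def initialize(N, D, low, up, seed=0):
    """Uniform random initialization of N individuals in [low, up]^D (Eq. 1)."""
    rng = np.random.default_rng(seed)
    return low + rng.random((N, D)) * (up - low)

# Example: 5 individuals in a 3-dimensional search space
P0 = initialize(N=5, D=3, low=-100.0, up=100.0)
```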

Mutation
The standard mutation strategy used in DE is "DE/rand/1", illustrated by Eq. (2):

v_i,G = x_r1,G + F × (x_r2,G − x_r3,G), (2)

where F is the scaling factor, varied from 0.4 to 1, and r1, r2, and r3 are indices randomly chosen from [1, N] such that i, r1, r2, and r3 are mutually different. G (G = 1, 2, 3, …, Maxgen) is the current generation. The control parameter F is a random value for each individual; a larger F is effective for global search, while a smaller F is useful for local search.
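A minimal NumPy sketch of the "DE/rand/1" mutation of Eq. (2):

```python
import numpy as np

def mutate(P, F, rng):
    """DE/rand/1 mutation (Eq. 2): v_i = x_r1 + F * (x_r2 - x_r3),
    with r1, r2, r3 mutually distinct and different from i."""
    N, D = P.shape
    V = np.empty_like(P)
    for i in range(N):
        choices = [k for k in range(N) if k != i]   # exclude the target index
        r1, r2, r3 = rng.choice(choices, size=3, replace=False)
        V[i] = P[r1] + F * (P[r2] - P[r3])
    return V
```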

Crossover
After mutation, the crossover operator is applied using Eq. (3):

u_i,j,G = v_i,j,G if rand(0, 1) ≤ CR or j = j_rand; otherwise u_i,j,G = x_i,j,G, (3)

where CR is the crossover control parameter selected from the range [0, 1), i = 1, 2, …, N, and j = 1, 2, …, D. j_rand is an integer value randomly chosen from [1, D], which guarantees that the trial vector u_i,G inherits at least one component from the mutant vector. CR controls the crossover probability: a larger CR causes the trial vector to inherit more elements from the mutant vector.
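The binomial crossover of Eq. (3) can be sketched as follows; the j_rand component is forced to come from the mutant vector so that the trial vector always differs from the target:

```python
import numpy as np

def binomial_crossover(P, V, CR, rng):
    """Binomial crossover (Eq. 3): u_ij = v_ij if rand <= CR or j == j_rand."""
    N, D = P.shape
    U = P.copy()
    for i in range(N):
        j_rand = rng.integers(D)          # guaranteed mutant component
        mask = rng.random(D) <= CR
        mask[j_rand] = True
        U[i, mask] = V[i, mask]
    return U
```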

Selection
In the selection process, DE chooses the better of the target vector x_i,G and the trial vector u_i,G according to their fitness values using Eq. (4):

x_i,G+1 = u_i,G if f(u_i,G) ≤ f(x_i,G); otherwise x_i,G+1 = x_i,G, (4)

where f(x) is the fitness value of vector x.
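The greedy selection of Eq. (4), assuming minimization, can be sketched as:

```python
import numpy as np

def select(P, U, f):
    """Greedy selection (Eq. 4): keep the vector with the better fitness."""
    fP = np.apply_along_axis(f, 1, P)
    fU = np.apply_along_axis(f, 1, U)
    better = fU <= fP                     # trial replaces target when not worse
    return np.where(better[:, None], U, P)
```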

CCDE
CCDE is a novel DE variant designed to be a global minimizer. Unlike the standard DE, CCDE can be explained by dividing its functions into four steps: initialization, selection-I, trial population generation, and selection-II. The trial population is generated by the crossover and covariance matrices. Algorithm 1 shows the general structure of CCDE.
The detailed description of CCDE is presented as follows.

Initialization
The initial population P_0 of CCDE is generated in the same way as in other DEs using Eq. (1). In contrast to other DE variants, M in CCDE is used to store the individuals of P in rearranged order. Moreover, M is used to control the search direction and thus enhance the capability of exploration. Once P_0 is determined, M_0 is initialized by Eq. (5).
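Assuming Eq. (5) simply stores a randomly permuted copy of P_0 (one plausible reading of the text; the paper's exact formula is not reproduced here), the memory initialization can be sketched as:

```python
import numpy as np

def init_memory(P0, rng):
    """Memory population M0: the individuals of P0 in randomly rearranged
    order (an assumption consistent with the description of Eq. 5)."""
    return P0[rng.permutation(len(P0))]
```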

Selection-I
The fitness values of initialized population P 0 are calculated, and the best individual is stored.

Generation of the crossover matrix
This step is the most important process in CCDE. M is adjusted prior to the generation of the trial population to randomly store the previous generation using Eq. (6), where a and b are random numbers with uniform distribution in the range (0, 1), and permuting is a function that changes the order of individuals in M and thus improves its diversity. As a result, the population has a memory capability that is mainly used to improve exploration.
Then, the crossover matrix (Cr) is generated randomly in place of the crossover operator. This matrix determines whether the individuals of P must be updated or not. Cr_G is composed of the integers 0 and 1 and is initialized as Cr_0 = 0 before the iteration. When Cr_i,j (i = 1, 2, …, N; j = 1, 2, …, D) is equal to 0, x_i,j,G remains unchanged. Otherwise, x_i,j,G is updated and generated using Eq. (7), where rand_a and rand_b are random values drawn from the uniform distribution in the range (0, 1), randi{D} is a function that randomly generates an integer value from 1 to D, and u(i:randi{D}) denotes the elements of the vector u from position i to position randi{D}. The elements of u are generated by applying the permuting function to the integers {1, 2, …, D}. In Eq. (7), when rand_a is less than rand_b, several vector elements of individual i are updated while the others remain unchanged; otherwise, only one vector element of individual i is changed.
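The following sketch follows the verbal description of Eq. (7) rather than its exact formula, which is not reproduced in this text; in particular, flagging a run of components taken from a random permutation is an assumption:

```python
import numpy as np

def crossover_matrix(N, D, rng):
    """Sketch of the crossover-matrix generation described around Eq. (7).
    For each individual, either several components (drawn from a random
    permutation u of the D indices) or exactly one component is flagged (1)
    for update; the exact indexing of Eq. (7) is an assumption here."""
    Cr = np.zeros((N, D), dtype=int)
    for i in range(N):
        u = rng.permutation(D)            # random ordering of the components
        if rng.random() < rng.random():   # rand_a < rand_b: update several
            k = rng.integers(1, D + 1)
            Cr[i, u[:k]] = 1
        else:                             # otherwise: update exactly one
            Cr[i, rng.integers(D)] = 1
    return Cr
```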
The crossover matrix in this step is mainly used to balance exploration and exploitation. Although it dispenses with CR, the crossover matrix of CCDE is more flexible and efficient than the crossover operator of other DEs because it markedly enhances population diversity.

Generation of covariance matrix
The best individual found during evolution is used as a leader to guide the search and thus improve the capability of exploitation. Newly generated individuals should be centered on the best individual, because the region around the best individual can be considered a promising region in which to find the next better individual. This idea is used to generate the covariance matrix. To avoid local optima, and drawing on the covariance matrix adaptation evolution strategy (CMA-ES) of Hansen and Ostermeier (2001), the covariance matrix learning of CoBiDE (Wang et al. 2014), and the differential covariance matrix adaptation EA of Ghosh et al. (2012), a novel covariance matrix strategy is proposed that learns from the previous best individual and the present population. With this strategy, the covariance matrix inherits the information accumulated during evolution and learns new information from the present population. The covariance matrix is generated by Eq. (8), where cov(x_best1,G, x_best2,G, …, x_bestλ,G) calculates the covariance matrix of the λ best individuals in the current generation and λ = ⌊N/4⌋. As in CMA-ES and CoBiDE, the covariance matrix is used to guide the generation of the trial population and fully utilizes the information of the individuals to improve the convergence speed. However, in contrast to CMA-ES and CoBiDE, the covariance matrix of CCDE considers the information of the λ best individuals.
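The covariance computation of Eq. (8) over the λ best individuals can be sketched with NumPy (λ = ⌊N/4⌋ as in the text; the floor of 2 is added only to keep np.cov well defined for small populations):

```python
import numpy as np

def covariance_of_best(P, fitness, lam=None):
    """Covariance matrix of the lambda best individuals (Eq. 8),
    with lambda = floor(N / 4); smaller fitness is assumed better."""
    N, D = P.shape
    lam = max(2, N // 4) if lam is None else lam
    best = P[np.argsort(fitness)[:lam]]   # rows = lam best individuals
    return np.cov(best, rowvar=False)     # D x D covariance across dimensions
```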

Generation of trial population
The trial population is generated in this step. The covariance matrix is used as a guide to search the region around the best individual via Gaussian distribution and thus improve exploitation. Exploration is enhanced using the form of "DE/rand/1" with an improved search direction determined by the memory and target populations. As a result, one of the two strategies is chosen randomly to balance exploration and exploitation, as formulated in Eq. (9), where M_G is the memory population, P_G′ is a randomly reordered copy of the population, C(1, 0.1) is the Cauchy distribution with location parameter 1 and scale parameter 0.1 (Wang et al. 2014), X_best,G is the current best population consisting of copies of the current best individual, and N(0, Co_G) is the Gaussian distribution with mean 0 and covariance Co_G. The adaptive step size r, computed from a random value rand in [0, 1], is similar to the cooling schedule in the simulated annealing algorithm (Edmonds 1971) and gradually decreases the search range. From Eq. (9), the search range around the current best individual narrows as r tends to 0 and G tends to Maxgen, thereby exploiting that region. Meanwhile, falling into a local optimum is avoided via the improved mutation operator based on the crossover matrix, which is chosen by the random selection strategy indicated in Eq. (9). Figure 1 illustrates the generation of the trial vector defined by Eq. (9).
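As a rough illustration of Eq. (9), the following sketch mixes the two strategies at random; the exact operands of Eq. (9), including the Cauchy factor and the adaptive step size r, are simplified assumptions here:

```python
import numpy as np

def trial_population(P, M, x_best, Co, F, rng):
    """Sketch of the two randomly mixed strategies of Eq. (9):
    (a) exploit: sample around the current best via N(0, Co);
    (b) explore: a DE/rand/1-style move whose direction is set by the
        memory population M and the target population P.
    The exact form of Eq. (9) is an assumption in this sketch."""
    N, D = P.shape
    V = np.empty_like(P)
    for i in range(N):
        if rng.random() < 0.5:            # exploitation around the best
            V[i] = x_best + rng.multivariate_normal(np.zeros(D), Co)
        else:                             # exploration guided by memory
            r1, r2 = rng.choice(N, size=2, replace=False)
            V[i] = P[i] + F * (M[r1] - P[r2])
    return V
```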
If the kth component v_i,k,G+1 of v_i,G+1 is outside the allowed search space, it is regenerated by Eq. (10), where low and up are the boundaries of the search space.
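A common boundary repair consistent with this description is to regenerate out-of-range components uniformly inside [low, up]; the exact form of Eq. (10) is an assumption in this sketch:

```python
import numpy as np

def repair_bounds(V, low, up, rng):
    """Regenerate components that leave [low, up] uniformly inside the
    search space (an assumption standing in for Eq. 10)."""
    V = V.copy()
    out = (V < low) | (V > up)
    V[out] = low + rng.random(np.count_nonzero(out)) * (up - low)
    return V
```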

Pseudo code for CCDE
The pseudo code can be presented in Algorithm 2 according to the description of CCDE in the previous subsections.
CCDE has a very simple structure as indicated by Algorithm 2. Combining the crossover matrix, covariance matrix, and M can achieve a good tradeoff between exploration and exploitation.

Experimental study
We analyze the performance of CCDE by conducting a set of experiments as well as a statistical analysis of the experimental results. We use MATLAB 2013a to develop the CCDE algorithm. Non-parametric statistical tests are used in the experimental comparisons because the numerical distributions of results do not always satisfy the conditions of normality and homoscedasticity. Therefore, our analyses mainly focus on the mean errors of 30 or 51 independent runs. Statistical tests are performed using the KEEL software, including the multi-problem Wilcoxon's test and Friedman's test (Alcalá et al. 2009).
We also conduct a series of comparisons with the canonical versions of DE as well as five algorithms from CEC 2014 to clarify the competitiveness of CCDE. All experiments are performed on a computer with 2.9 GHz Intel(R) Core(TM) i5-2310 processor and 4.0 GB of RAM in Windows XP. The set of benchmarks and the parameter settings are described in detail.

Benchmark functions
A total of 30 benchmark functions developed for IEEE CEC 2014 (Liang et al. 2013) are used, as well as 5 real-world engineering optimization problems selected from IEEE CEC 2011 (Das and Suganthan 2010). The 30 benchmarks are presented first, and the 5 real-world engineering optimization problems are described in the following section. The 30 benchmarks can be divided into 4 classes: unimodal functions (F1-F3), simple multimodal functions (F4-F16), hybrid functions (F17-F22), and composition functions (F23-F30). Each of these test functions is shifted. F8 and F10 are separable functions, while the rest are non-separable. Some test functions are rotated using different rotation matrices to introduce correlation among the variables. The global optima of some test functions are shifted to avoid being at the center of the search space. In contrast to the test functions in previous IEEE CEC competitions, the rotation matrix for each subcomponent is generated from standard normally distributed entries by Gram-Schmidt orthonormalization. The variables in the hybrid functions are randomly divided into subcomponents, and different basic functions are then used for different subcomponents. In the composition functions, the local optimum with the smallest bias value is the global optimum, which is set at the origin as a trap for each composition function included in this benchmark suite. Table 1 shows the set of 30 test functions, which are described in detail in Liang et al. (2013).
In this section, the mean errors and standard deviations of the function error value [f(x) − f(x′)] are calculated over 30 or 51 independent runs for each test function; x is the best solution in the population when the algorithm terminates, and x′ is the global optimal value. Multi-problem Wilcoxon's test and Friedman's test at a 0.05 significance level are performed to test the statistical significance of the experimental results among the compared algorithms. The parameter N in this section is set to 100.

Comparison with other DEs
CCDE is compared with four other DE variants, namely, JADE (Zhang and Sanderson 2009), SaDE (Qin et al. 2009), EPSDE (Mallipeddi et al. 2011), and CoBiDE (Wang et al. 2014). The covariance matrix used in CoBiDE is also based on CMA-ES, and its performance is superior to that of CMA-ES. Thus, we choose only CoBiDE, instead of CMA-ES, as a competitor for comparison. The parameter settings of the four algorithms are the same as those in the original papers. JADE adopts self-adaptive parameter settings with F_initial = 0.5 and CR_initial = 0.9. SaDE uses the normal distribution N(0.5, 0.3) to produce F and the normal distribution N(CR_m, 0.1) to adjust CR self-adaptively. EPSDE sets F = 0.9 and CR = 0.1. CoBiDE sets pb = 0.4 and ps = 0.5. In this experiment, D of the 30 test functions is set to 10, and each test function is run independently 30 times with 300,000 function evaluations (FEs) and error value Error = 10^−8 as the termination criterion.
The experimental results of CCDE and the four other algorithms are summarized in Table 2. The portions in italic in Table 2 represent the best results among the algorithms on the test functions. CCDE, JADE, SaDE, and CoBiDE exhibit the best performance on the three unimodal functions F1-F3, whereas EPSDE performs worse than the four other algorithms on these functions. For the simple multimodal functions F4-F16, CCDE exhibits the best performance on F4-F9 and F11-F14 compared with the four other algorithms; in particular, CCDE reaches the global best value on F4 and F6-F8. CoBiDE shows the best performance on F10 and F15 among all algorithms, and EPSDE outperforms the four other algorithms on F16. The outstanding performance of CCDE can be attributed to its proposed strategies, which balance exploration and exploitation. None of the five algorithms finds the global best values for the hybrid functions F17-F22. However, Table 2 shows that CCDE outperforms the other algorithms on the majority of these functions, except F18, on which CoBiDE performs better than CCDE. The results of the five algorithms for the composition functions F23-F30, which are the most difficult among the 30 benchmarks, are far from the global optima. Table 2 shows that CCDE is statistically better than the other algorithms on F23-F26 and F28-F30, while CoBiDE exhibits the best performance on F27.
We also perform the multi-problem Wilcoxon's test, accomplished using the KEEL software, to check the behavior of the algorithms (Alcalá et al. 2009). Tables 3 and 4 summarize the results of the Wilcoxon's and Friedman's tests; the portions in italic in Tables 3 and 4 represent the best results among the algorithms on the test functions. Table 3 shows that CCDE provides higher R+ values than R− values in all cases. Wilcoxon's test at α = 0.05 shows significant differences between CCDE and the competitors. This result indicates that CCDE is significantly better than JADE, SaDE, EPSDE, and CoBiDE on the 30 test functions at α = 0.05.
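The R+ and R− values reported by the multi-problem Wilcoxon's test are signed-rank sums over per-function differences; they can be computed as follows (the error values below are illustrative, not the paper's results):

```python
import numpy as np

def signed_rank_sums(errors_a, errors_b):
    """R+ / R- of the Wilcoxon signed-rank test: rank the absolute
    per-problem differences, then sum the ranks where algorithm A has the
    smaller error (R+) and where B has the smaller error (R-).
    Zero differences are dropped; ties in |d| share average ranks."""
    d = np.asarray(errors_a, float) - np.asarray(errors_b, float)
    d = d[d != 0]
    order = np.abs(d).argsort()
    ranks = np.empty(len(d))
    ranks[order] = np.arange(1, len(d) + 1)
    for v in np.unique(np.abs(d)):        # average ranks over tied |d|
        m = np.abs(d) == v
        ranks[m] = ranks[m].mean()
    R_plus = ranks[d < 0].sum()           # A has the smaller error
    R_minus = ranks[d > 0].sum()          # B has the smaller error
    return R_plus, R_minus
```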

Friedman's test based on the KEEL software is performed to further detect significant differences between CCDE and the four compared algorithms (Alcalá et al. 2009). Iman-Davenport's procedure is used as the post hoc procedure. Table 4 summarizes the ranking of the five algorithms obtained by Friedman's test. CCDE ranks comparably with JADE, SaDE, and CoBiDE on the unimodal functions and ranks best on the multimodal, hybrid, and composition functions. Thus, CCDE ranks best overall on the 30 benchmarks of 10 dimensions compared with JADE, SaDE, EPSDE, and CoBiDE. Figures 2 and 3 illustrate the mean function error values of the 5 algorithms over 30 independent runs on 24 typical benchmark functions. Figure 2 shows that CCDE provides better convergence trends on F1, F4-F9, and F11-F12 than the other algorithms, JADE shows the best convergence trends on F2 and F3, and CoBiDE presents the best convergence trend on F10. Figure 3 shows that CCDE performs better than the other algorithms in terms of convergence trends on F13-F15, F20-F22, F25, F27, and F30.

Comparison with CEC 2014 algorithms
CCDE is compared with five algorithms from CEC 2014 in terms of single-objective real-parameter numerical optimization. These algorithms all participated in that special session. They include modern real-coded optimizers hybridized with local search or using covariance matrix methods, and some follow evolutionary computation or swarm intelligence variants. The five algorithms are the covariance matrix learning and searching preference algorithm (CMLSP) (Chen et al. 2014); non-uniform mapping in real-coded genetic algorithm (NRGA) (Yashesh et al. 2014); simultaneous optimistic optimization (SOO) (Preux et al. 2014); the fireworks algorithm with DE (FWA-DE) (Yu et al. 2014); and OptBees, which is inspired by the collective decision-making of bee colonies (Maia et al. 2014). The experimental results of the compared algorithms are taken directly from these papers (Chen et al. 2014; Yashesh et al. 2014; Preux et al. 2014; Yu et al. 2014; Maia et al. 2014) to ensure fair comparison. In this experiment, D of the 30 test functions is set to 30, and each test function is run independently 51 times with 300,000 FEs and error value Error = 10^−8 as the termination criterion. The parameter N in CCDE is set to 100. Table 5 summarizes the experimental results of CCDE and the other algorithms in terms of the mean errors and standard deviations of 51 independent runs. The portions in italic in Table 5 represent the best results among the algorithms on the test functions. CCDE performs better than the five other algorithms on the majority of the test functions.
Wilcoxon's and Friedman's tests are performed to further detect significant differences among CCDE and the five competitors (Alcalá et al. 2009). Tables 6 and 7 summarize the  results of these tests. The portions in italic in Tables 6 and 7 represent the best results among the algorithms in terms of the optimization of the test functions.
The R+ values in Table 6 show that CCDE has better statistical performance than CMLSP, NRGA, SOO, FWA-DE, and OptBees. Wilcoxon's test at α = 0.05 shows significant differences between CCDE and the competitors, except for CMLSP. Table 7 shows that CCDE and CMLSP rank best on the unimodal functions with 30-dimensional variables, while CCDE ranks best on the multimodal, hybrid, and composition functions. Thus, CCDE ranks first on the 30 test functions. Figure 4 illustrates the convergence progress on typical test functions with 30-dimensional variables.

Real-world application problems
In addition to the 30 benchmarks in the previous sections, 5 real-world engineering optimization problems from IEEE CEC 2011 are selected to evaluate the performance of CCDE in this subsection. These five problems (denoted RP 1 -RP 5 ) include the parameter estimation for frequency-modulated sound waves (T01 in CEC 2011); detailed descriptions of the problems can be found in Das and Suganthan (2010). The parameters of CCDE and the other compared DEs are the same as those used for the 30 benchmarks. A total of 30 independent runs are performed for each problem, with 150,000 FEs as the termination criterion. Table 8 summarizes the means and standard deviations of the objective function values over 30 independent runs for each problem. Wilcoxon's and Friedman's tests at a 0.05 significance level are applied to the experimental results using the KEEL software to draw statistically sound conclusions (Alcalá et al. 2009). Table 9 shows that CCDE has higher R+ values than the other algorithms on all problems. Moreover, the p values are less than 0.05 in all cases, except for CCDE versus CoBiDE. In addition, CCDE has the best ranking according to Table 10. The portions in italic in Tables 8, 9, and 10 represent the best results among the algorithms.
Therefore, these experimental results verify the potential of CCDE in real-world applications.

Conclusions
The number of works in evolutionary computation involving the solution of difficult optimization problems has been increasing in recent years. DE is an efficient and robust EA and is a hotspot in this field. CCDE, a DE variant based on strategies guided by the crossover and covariance matrices, is proposed in this paper to improve the performance of DE and simplify its structure.
In CCDE, the classical crossover operation and its associated parameter CR in DE are replaced by the crossover matrix, a binary (0, 1) matrix of size N × D computed by a random generation equation. This improvement enhances the exploration capability by increasing the diversity of the population. The covariance matrix generated from the λ best individuals is used to fully utilize the information of the best individuals and to randomly search the region around the best individual via Gaussian distribution; accordingly, the exploitation capability is improved. In addition, M is introduced to store the previous generation and control the search direction, which further enhances population diversity. CCDE has been tested on 30 benchmark test functions developed for IEEE CEC 2014 and 5 complex real-world engineering optimization problems selected from IEEE CEC 2011. The experimental and statistical results suggest that the performance of CCDE is better than those of the four other DE variants and the five algorithms from CEC 2014. CCDE shows high solution quality and robustness on the tested benchmark functions and real-world engineering problems. Future studies can extend CCDE by applying the algorithm to other classes of problems, such as multi-objective and constrained optimization problems. The method of CCDE and an overall comparison with other evolutionary algorithms can also be studied comprehensively.