# American Institute of Mathematical Sciences

May  2018, 1(2): 181-200. doi: 10.3934/mfc.2018009

## Hybrid binary dragonfly enhanced particle swarm optimization algorithm for solving feature selection problems

1. Department of Mathematics and Statistics, Faculty of Science, Thompson Rivers University, Kamloops, BC, V2C 0C8, Canada
2. Department of Mathematics and Computer Science, Faculty of Science, Alexandria University, Moharam Bey 21511, Alexandria, Egypt
3. Electrical and Computer Engineering, The University of British Columbia, Vancouver BC V6T 1Z4, Canada

* Corresponding author: Mohamed A. Tawhid

Received  November 2017 Revised  January 2018 Published  May 2018

Fund Project: We are grateful to the four anonymous reviewers for constructive feedback and insightful suggestions, which greatly improved this article. This research was supported partially by Mitacs Canada. The research of the first author is supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC).

In this paper, we present a new hybrid binary version of the dragonfly and enhanced particle swarm optimization algorithms for solving feature selection problems. The proposed algorithm is called the Hybrid Binary Dragonfly Enhanced Particle Swarm Optimization Algorithm (HBDESPO). HBDESPO combines the dragonfly algorithm, whose formation of static swarms encourages diverse solutions, with an enhanced version of particle swarm optimization, which exploits the data through its ability to converge to the best global solution in the search space. To investigate its general performance, we compare HBDESPO with the original optimizers and with other optimizers that have previously been used for feature selection. We use a set of assessment indicators to evaluate and compare the optimizers over 20 standard data sets obtained from the UCI repository. The results demonstrate the ability of the proposed HBDESPO algorithm to search the feature space for optimal feature combinations.

Citation: Mohamed A. Tawhid, Kevin B. Dsouza. Hybrid binary dragonfly enhanced particle swarm optimization algorithm for solving feature selection problems. Mathematical Foundations of Computing, 2018, 1 (2) : 181-200. doi: 10.3934/mfc.2018009

Comparison of the performance of the HBDEPSO algorithm with other optimizers on the main objectives of feature selection. The values are averaged over all the datasets.

Comparison of the performance of the HBDEPSO algorithm with other optimizers on several assessment indicators. The values are averaged over all the datasets.

Datasets

| Dataset | # of Attributes | # of Instances |
|---|---|---|
| Zoo | 16 | 101 |
| WineEW | 13 | 178 |
| IonosphereEW | 34 | 351 |
| WaveformEW | 40 | 5000 |
| BreastEW | 30 | 569 |
| Breastcancer | 9 | 699 |
| Congress | 16 | 435 |
| Exactly | 13 | 1000 |
| Exactly2 | 13 | 1000 |
| HeartEW | 13 | 270 |
| KrvskpEW | 36 | 3196 |
| M-of-n | 13 | 1000 |
| SonarEW | 60 | 208 |
| SpectEW | 60 | 208 |
| Tic-tac-toe | 9 | 958 |
| Lymphography | 18 | 148 |
| Dermatology | 34 | 366 |
| Echocardiogram | 12 | 132 |
| Hepatitis | 19 | 155 |
| LungCancer | 56 | 32 |

Parameter setting

| Parameter | Value |
|---|---|
| No. of iterations ($max_{iter}$) | 70 |
| No. of search agents ($n$) | 5 |
| Dimension ($D$) | No. of features in the data |
| Search domain | [0, 1] |
| No. of runs ($M$) | 10 |
| $w_{max}$ | 0.9 |
| $w_{min}$ | 0.4 |
| $\Delta x_{max}$ | 6 |
| $c_1$ | 2 |
| $c_2$ | 2 |
| $v_{max}$ | 6 |
| $\beta$ in fitness function | 0.01 |
| $\alpha$ in fitness function | 0.99 |

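
The parameter table lists weights $\alpha$ and $\beta$ for the fitness function and bounds $w_{max}$, $w_{min}$ for the inertia weight, but this page does not state the formulas themselves. The sketch below assumes the wrapper fitness commonly used in this literature, $\alpha \cdot$ error $+ \beta \cdot$ (selected features / total features), and a linearly decreasing inertia weight. Both forms, and the names `fitness`, `inertia_weight`, and `clamp_velocity`, are illustrative assumptions rather than the authors' stated definitions.

```python
# Values taken from the parameter table.
ALPHA = 0.99      # weight on the classification error
BETA = 0.01       # weight on the selected-feature ratio
W_MAX, W_MIN = 0.9, 0.4
V_MAX = 6.0
MAX_ITER = 70


def fitness(error_rate, mask):
    """Assumed wrapper fitness (lower is better).

    error_rate: classification error in [0, 1] for the selected subset.
    mask: sequence of 0/1 flags, one per feature (1 = selected).
    """
    if len(mask) == 0:
        raise ValueError("mask must contain at least one feature flag")
    selected_ratio = sum(mask) / len(mask)
    return ALPHA * error_rate + BETA * selected_ratio


def inertia_weight(iteration, max_iter=MAX_ITER):
    """Assumed linear decay from W_MAX at iteration 0 to W_MIN at max_iter."""
    return W_MAX - (W_MAX - W_MIN) * iteration / max_iter


def clamp_velocity(velocity, v_max=V_MAX):
    """Keep a velocity component inside [-v_max, v_max]."""
    return max(-v_max, min(v_max, velocity))
```

Under these assumptions, a subset that selects 2 of 4 features with a 10% error rate scores 0.99 × 0.10 + 0.01 × 0.5 = 0.104. The heavy weight on the error term means accuracy dominates, and the feature ratio mainly breaks ties between subsets of similar accuracy.
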
Mean fitness function obtained from the different algorithms

| Dataset | HBDESPO | BDA | EPSO | BGA | BBA | BGWO2 | HBEPSOD |
|---|---|---|---|---|---|---|---|
| Zoo | 0.040 | 0.067 | 0.031 | 0.124 | 0.094 | 0.119 | 0.082 |
| WineEW | 0.036 | 0.050 | 0.042 | 0.065 | 0.128 | 0.092 | 0.041 |
| IonosphereEW | 0.110 | 0.130 | 0.137 | 0.143 | 0.146 | 0.172 | 0.115 |
| WaveformEW | 0.179 | 0.183 | 0.175 | 0.186 | 0.193 | 0.185 | 0.175 |
| BreastEW | 0.040 | 0.057 | 0.050 | 0.106 | 0.070 | 0.080 | 0.044 |
| Breastcancer | 0.023 | 0.032 | 0.032 | 0.036 | 0.035 | 0.042 | 0.030 |
| Congress | 0.028 | 0.042 | 0.033 | 0.059 | 0.053 | 0.073 | 0.036 |
| Exactly | 0.103 | 0.178 | 0.104 | 0.269 | 0.303 | 0.316 | 0.139 |
| Exactly2 | 0.224 | 0.240 | 0.234 | 0.243 | 0.243 | 0.263 | 0.241 |
| HeartEW | 0.125 | 0.153 | 0.153 | 0.250 | 0.240 | 0.268 | 0.128 |
| KrvskpEW | 0.044 | 0.041 | 0.043 | 0.089 | 0.108 | 0.080 | 0.039 |
| M-of-n | 0.025 | 0.048 | 0.024 | 0.108 | 0.167 | 0.154 | 0.084 |
| SonarEW | 0.158 | 0.194 | 0.192 | 0.262 | 0.277 | 0.290 | 0.179 |
| SpectEW | 0.148 | 0.133 | 0.160 | 0.168 | 0.167 | 0.205 | 0.142 |
| Tic-tac-toe | 0.222 | 0.223 | 0.222 | 0.241 | 0.270 | 0.262 | 0.227 |
| Lymphography | 0.381 | 0.392 | 0.412 | 0.466 | 0.487 | 0.531 | 0.426 |
| Dermatology | 0.016 | 0.017 | 0.016 | 0.031 | 0.081 | 0.099 | 0.017 |
| Echocardiogram | 0.051 | 0.058 | 0.083 | 0.072 | 0.112 | 0.200 | 0.074 |
| Hepatitis | 0.118 | 0.101 | 0.123 | 0.152 | 0.175 | 0.192 | 0.115 |
| LungCancer | 0.219 | 0.255 | 0.220 | 0.318 | 0.427 | 0.455 | 0.291 |
| Average | 0.114 | 0.131 | 0.123 | 0.169 | 0.189 | 0.204 | 0.131 |

Best fitness function obtained from the different algorithms

| Dataset | HBDESPO | BDA | EPSO | BGA | BBA | BGWO2 | HBEPSOD |
|---|---|---|---|---|---|---|---|
| Zoo | 0.000 | 0.000 | 0.001 | 0.032 | 0.005 | 0.035 | 0.004 |
| WineEW | 0.002 | 0.003 | 0.019 | 0.035 | 0.021 | 0.003 | 0.019 |
| IonosphereEW | 0.071 | 0.108 | 0.113 | 0.114 | 0.079 | 0.089 | 0.096 |
| WaveformEW | 0.171 | 0.181 | 0.165 | 0.174 | 0.176 | 0.167 | 0.162 |
| BreastEW | 0.025 | 0.055 | 0.027 | 0.060 | 0.045 | 0.056 | 0.034 |
| Breastcancer | 0.014 | 0.024 | 0.018 | 0.029 | 0.024 | 0.027 | 0.014 |
| Congress | 0.016 | 0.019 | 0.022 | 0.038 | 0.029 | 0.045 | 0.022 |
| Exactly | 0.004 | 0.040 | 0.025 | 0.058 | 0.270 | 0.298 | 0.025 |
| Exactly2 | 0.211 | 0.235 | 0.219 | 0.216 | 0.212 | 0.241 | 0.220 |
| HeartEW | 0.091 | 0.082 | 0.104 | 0.147 | 0.168 | 0.147 | 0.082 |
| KrvskpEW | 0.041 | 0.034 | 0.033 | 0.041 | 0.060 | 0.059 | 0.029 |
| M-of-n | 0.004 | 0.004 | 0.004 | 0.067 | 0.113 | 0.128 | 0.004 |
| SonarEW | 0.118 | 0.156 | 0.118 | 0.220 | 0.205 | 0.234 | 0.134 |
| SpectEW | 0.115 | 0.093 | 0.125 | 0.125 | 0.127 | 0.161 | 0.115 |
| Tic-tac-toe | 0.213 | 0.206 | 0.185 | 0.217 | 0.236 | 0.242 | 0.196 |
| Lymphography | 0.286 | 0.344 | 0.307 | 0.388 | 0.427 | 0.450 | 0.349 |
| Dermatology | 0.003 | 0.003 | 0.004 | 0.012 | 0.029 | 0.046 | 0.004 |
| Echocardiogram | 0.003 | 0.025 | 0.047 | 0.045 | 0.049 | 0.093 | 0.047 |
| Hepatitis | 0.058 | 0.058 | 0.080 | 0.078 | 0.117 | 0.097 | 0.061 |
| LungCancer | 0.093 | 0.003 | 0.058 | 0.093 | 0.184 | 0.28 | 0.094 |
| Average | 0.077 | 0.084 | 0.084 | 0.110 | 0.129 | 0.145 | 0.086 |

Worst fitness function obtained from the different algorithms

| Dataset | HBDESPO | BDA | EPSO | BGA | BBA | BGWO2 | HBEPSOD |
|---|---|---|---|---|---|---|---|
| Zoo | 0.121 | 0.208 | 0.089 | 0.208 | 0.208 | 0.208 | 0.206 |
| WineEW | 0.069 | 0.119 | 0.070 | 0.122 | 0.273 | 0.157 | 0.070 |
| IonosphereEW | 0.146 | 0.155 | 0.171 | 0.189 | 0.191 | 0.309 | 0.147 |
| WaveformEW | 0.184 | 0.192 | 0.186 | 0.197 | 0.215 | 0.195 | 0.186 |
| BreastEW | 0.049 | 0.065 | 0.081 | 0.315 | 0.103 | 0.115 | 0.054 |
| Breastcancer | 0.031 | 0.039 | 0.041 | 0.049 | 0.049 | 0.052 | 0.038 |
| Congress | 0.043 | 0.063 | 0.049 | 0.085 | 0.092 | 0.089 | 0.049 |
| Exactly | 0.213 | 0.308 | 0.251 | 0.349 | 0.326 | 0.342 | 0.294 |
| Exactly2 | 0.238 | 0.263 | 0.248 | 0.268 | 0.276 | 0.286 | 0.265 |
| HeartEW | 0.168 | 0.201 | 0.289 | 0.322 | 0.334 | 0.357 | 0.168 |
| KrvskpEW | 0.047 | 0.052 | 0.054 | 0.177 | 0.191 | 0.101 | 0.063 |
| M-of-n | 0.049 | 0.136 | 0.073 | 0.157 | 0.232 | 0.170 | 0.461 |
| SonarEW | 0.191 | 0.234 | 0.219 | 0.306 | 0.391 | 0.349 | 0.262 |
| SpectEW | 0.170 | 0.170 | 0.204 | 0.205 | 0.216 | 0.238 | 0.192 |
| Tic-tac-toe | 0.236 | 0.239 | 0.244 | 0.275 | 0.313 | 0.298 | 0.243 |
| Lymphography | 0.468 | 0.491 | 0.469 | 0.588 | 0.569 | 0.581 | 0.549 |
| Dermatology | 0.029 | 0.053 | 0.029 | 0.061 | 0.290 | 0.222 | 0.030 |
| Echocardiogram | 0.070 | 0.092 | 0.16 | 0.114 | 0.23 | 0.840 | 0.115 |
| Hepatitis | 0.230 | 0.138 | 0.174 | 0.212 | 0.234 | 0.253 | 0.175 |
| LungCancer | 0.542 | 0.454 | 0.543 | 0.723 | 0.813 | 0.545 | 0.722 |
| Average | 0.165 | 0.184 | 0.182 | 0.246 | 0.277 | 0.285 | 0.214 |

Standard deviation of the fitness function obtained from the different algorithms

| Dataset | HBDESPO | BDA | EPSO | BGA | BBA | BGWO2 | HBEPSOD |
|---|---|---|---|---|---|---|---|
| Zoo | 0.052 | 0.075 | 0.033 | 0.066 | 0.070 | 0.067 | 0.056 |
| WineEW | 0.019 | 0.030 | 0.018 | 0.026 | 0.080 | 0.057 | 0.017 |
| IonosphereEW | 0.022 | 0.018 | 0.016 | 0.025 | 0.040 | 0.057 | 0.013 |
| WaveformEW | 0.003 | 0.006 | 0.008 | 0.008 | 0.0123 | 0.007 | 0.006 |
| BreastEW | 0.006 | 0.007 | 0.019 | 0.755 | 0.017 | 0.018 | 0.006 |
| Breastcancer | 0.005 | 0.005 | 0.007 | 0.007 | 0.009 | 0.008 | 0.009 |
| Congress | 0.007 | 0.016 | 0.008 | 0.013 | 0.019 | 0.015 | 0.008 |
| Exactly | 0.071 | 0.119 | 0.082 | 0.078 | 0.020 | 0.016 | 0.117 |
| Exactly2 | 0.009 | 0.015 | 0.009 | 0.019 | 0.017 | 0.018 | 0.015 |
| HeartEW | 0.025 | 0.036 | 0.055 | 0.062 | 0.064 | 0.069 | 0.025 |
| KrvskpEW | 0.002 | 0.007 | 0.007 | 0.051 | 0.044 | 0.012 | 0.010 |
| M-of-n | 0.018 | 0.051 | 0.022 | 0.032 | 0.036 | 0.019 | 0.136 |
| SonarEW | 0.027 | 0.033 | 0.029 | 0.030 | 0.059 | 0.043 | 0.037 |
| SpectEW | 0.016 | 0.022 | 0.027 | 0.029 | 0.028 | 0.024 | 0.029 |
| Tic-tac-toe | 0.007 | 0.012 | 0.020 | 0.021 | 0.025 | 0.017 | 0.014 |
| Lymphography | 0.052 | 0.049 | 0.048 | 0.062 | 0.047 | 0.044 | 0.055 |
| Dermatology | 0.007 | 0.014 | 0.008 | 0.014 | 0.075 | 0.050 | 0.008 |
| Echocardiogram | 0.024 | 0.026 | 0.030 | 0.024 | 0.055 | 0.228 | 0.025 |
| Hepatitis | 0.050 | 0.025 | 0.028 | 0.038 | 0.043 | 0.052 | 0.030 |
| LungCancer | 0.092 | 0.151 | 0.180 | 0.233 | 0.194 | 0.093 | 0.183 |
| Average | 0.026 | 0.036 | 0.033 | 0.080 | 0.048 | 0.046 | 0.040 |

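
The four fitness tables report the mean, best, worst, and standard deviation of the fitness, presumably taken over the $M$ = 10 independent runs listed in the parameter table. A minimal sketch of how such indicators could be computed from per-run results is given here. The function name `summarize_runs` is hypothetical, and the use of the sample standard deviation (dividing by $M - 1$) is an assumption, since the normalization is not stated on this page. In every table row the best value is no larger than the worst, so fitness is treated as minimized and the best value is the minimum.

```python
import statistics


def summarize_runs(final_fitness_per_run):
    """Summarize the final fitness of several independent runs.

    Returns the mean, best (minimum), worst (maximum), and sample
    standard deviation of the supplied values.
    """
    values = list(final_fitness_per_run)
    if len(values) < 2:
        raise ValueError("need at least two runs to compute a standard deviation")
    return {
        "mean": statistics.mean(values),
        "best": min(values),
        "worst": max(values),
        "std": statistics.stdev(values),  # assumption: sample (M - 1) form
    }
```

A small standard deviation indicates that an optimizer reaches similar fitness values across runs, which is how the table can be read as a measure of repeatability.
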
Average performance of the selected features by different algorithms

| Dataset | HBDESPO | BDA | EPSO | BGA | BBA | BGWO2 | HBEPSOD |
|---|---|---|---|---|---|---|---|
| Zoo | 0.844 | 0.788 | 0.791 | 0.863 | 0.799 | 0.851 | 0.852 |
| WineEW | 0.916 | 0.923 | 0.881 | 0.886 | 0.726 | 0.896 | 0.888 |
| IonosphereEW | 0.835 | 0.799 | 0.829 | 0.828 | 0.817 | 0.824 | 0.810 |
| WaveformEW | 0.823 | 0.807 | 0.809 | 0.806 | 0.779 | 0.819 | 0.806 |
| BreastEW | 0.949 | 0.944 | 0.931 | 0.892 | 0.842 | 0.908 | 0.926 |
| Breastcancer | 0.960 | 0.956 | 0.956 | 0.957 | 0.957 | 0.957 | 0.958 |
| Congress | 0.945 | 0.931 | 0.943 | 0.915 | 0.893 | 0.928 | 0.935 |
| Exactly | 0.895 | 0.798 | 0.884 | 0.687 | 0.647 | 0.680 | 0.846 |
| Exactly2 | 0.746 | 0.739 | 0.738 | 0.734 | 0.711 | 0.732 | 0.736 |
| HeartEW | 0.815 | 0.81 | 0.776 | 0.711 | 0.648 | 0.702 | 0.811 |
| KrvskpEW | 0.959 | 0.954 | 0.958 | 0.906 | 0.772 | 0.917 | 0.958 |
| M-of-n | 0.978 | 0.949 | 0.975 | 0.892 | 0.719 | 0.843 | 0.957 |
| SonarEW | 0.705 | 0.658 | 0.682 | 0.694 | 0.678 | 0.682 | 0.682 |
| SpectEW | 0.762 | 0.752 | 0.757 | 0.750 | 0.755 | 0.777 | 0.747 |
| Tic-tac-toe | 0.748 | 0.745 | 0.740 | 0.734 | 0.647 | 0.713 | 0.737 |
| Lymphography | 0.406 | 0.417 | 0.354 | 0.416 | 0.422 | 0.379 | 0.411 |
| Dermatology | 0.958 | 0.940 | 0.952 | 0.95 | 0.802 | 0.908 | 0.945 |
| Echocardiogram | 0.875 | 0.893 | 0.906 | 0.852 | 0.861 | 0.877 | 0.863 |
| Hepatitis | 0.819 | 0.788 | 0.813 | 0.798 | 0.788 | 0.788 | 0.803 |
| LungCancer | 0.481 | 0.427 | 0.390 | 0.409 | 0.343 | 0.345 | 0.345 |
| Average | 0.821 | 0.801 | 0.803 | 0.784 | 0.730 | 0.776 | 0.801 |

Average selected feature ratio by different algorithms

| Dataset | HBDESPO | BDA | EPSO | BGA | BBA | BGWO2 | HBEPSOD |
|---|---|---|---|---|---|---|---|
| Zoo | 0.293 | 0.331 | 0.356 | 0.412 | 0.512 | 0.473 | 0.4 |
| WineEW | 0.284 | 0.338 | 0.4 | 0.315 | 0.538 | 0.516 | 0.338 |
| IonosphereEW | 0.367 | 0.397 | 0.388 | 0.402 | 0.526 | 0.541 | 0.397 |
| WaveformEW | 0.633 | 0.666 | 0.709 | 0.676 | 0.634 | 1 | 0.752 |
| BreastEW | 0.241 | 0.283 | 0.241 | 0.290 | 0.480 | 0.470 | 0.3 |
| Breastcancer | 0.411 | 0.422 | 0.511 | 0.566 | 0.511 | 0.644 | 0.544 |
| Congress | 0.306 | 0.337 | 0.325 | 0.412 | 0.493 | 0.575 | 0.318 |
| Exactly | 0.469 | 0.507 | 0.507 | 0.561 | 0.538 | 0.576 | 0.523 |
| Exactly2 | 0.392 | 0.392 | 0.492 | 0.4 | 0.546 | 0.8 | 0.415 |
| HeartEW | 0.391 | 0.407 | 0.407 | 0.415 | 0.492 | 0.430 | 0.4 |
| KrvskpEW | 0.486 | 0.475 | 0.502 | 0.530 | 0.513 | 0.633 | 0.516 |
| M-of-n | 0.515 | 0.530 | 0.476 | 0.576 | 0.446 | 0.923 | 0.515 |
| SonarEW | 0.44 | 0.413 | 0.463 | 0.42 | 0.521 | 0.533 | 0.475 |
| SpectEW | 0.413 | 0.454 | 0.463 | 0.425 | 0.481 | 0.529 | 0.440 |
| Tic-tac-toe | 0.555 | 0.555 | 0.666 | 0.511 | 0.577 | 0.866 | 0.533 |
| Lymphography | 0.39 | 0.438 | 0.416 | 0.4 | 0.461 | 0.535 | 0.45 |
| Dermatology | 0.5 | 0.411 | 0.511 | 0.479 | 0.494 | 0.544 | 0.5 |
| Echocardiogram | 0.225 | 0.233 | 0.266 | 0.283 | 0.508 | 0.483 | 0.25 |
| Hepatitis | 0.273 | 0.273 | 0.321 | 0.231 | 0.515 | 0.431 | 0.294 |
| LungCancer | 0.35 | 0.353 | 0.423 | 0.380 | 0.498 | 0.526 | 0.357 |
| Average | 0.397 | 0.411 | 0.442 | 0.434 | 0.514 | 0.601 | 0.436 |

Average Fisher index of the selected features by different algorithms

| Dataset | HBDESPO | BDA | EPSO | BGA | BBA | BGWO2 | HBEPSOD |
|---|---|---|---|---|---|---|---|
| Zoo | 161 | 105 | 143 | 156 | 112 | 130 | 140 |
| WineEW | 24.13 | 374.67 | 648.48 | 540.52 | 11784.7 | 20939.7 | 41.169 |
| IonosphereEW | 3.602 | 3.986 | 3.870 | 4.769 | 4.154 | 5.042 | 4.056 |
| WaveformEW | 2.355 | 2.314 | 2.314 | 2.165 | 2.029 | 3.456 | 2.454 |
| BreastEW | 7.2E+13 | 2.5E+11 | 3.3E+13 | 1.4E+13 | 5.7E+12 | 6.9E+13 | 3.1E+11 |
| Breastcancer | 1.190 | 0.748 | 1.070 | 0.923 | 0.942 | 1.105 | 0.884 |
| Congress | 48.584 | 31.797 | 13.996 | 13.045 | 11.003 | 18.317 | 22.088 |
| Exactly | 0.391 | 0.259 | 0.131 | 0.378 | 0.350 | 0.282 | 0.144 |
| Exactly2 | 0.395 | 0.240 | 0.267 | 0.287 | 0.200 | 0.237 | 0.227 |
| HeartEW | 3.788 | 3.424 | 3.357 | 140.64 | 161.62 | 430.07 | 2.197 |
| KrvskpEW | 1396.5 | 544.21 | 940.24 | 1023.2 | 639.91 | 1187.5 | 913.89 |
| M-of-n | 1.791 | 1.711 | 1.735 | 1.786 | 1.652 | 1.373 | 1.693 |
| SonarEW | 6.4E+6 | 7.3E+6 | 8.2E+6 | 5.5E+6 | 8.2E+6 | 1.2E+7 | 9.5E+6 |
| SpectEW | 0.008 | 0.006 | 0.005 | 0.004 | 0.006 | 0.006 | 0.006 |
| Tic-tac-toe | 0.168 | 0.090 | 0.161 | 0.119 | 0.136 | 0.117 | 0.134 |
| Lymphography | 9.77 | 3.13 | 9.18 | 2.43 | 4.41 | 2.73 | 3.51 |
| Dermatology | 400 | 269 | 343 | 148 | 210 | 174 | 207 |
| Echocardiogram | 158.28 | 579.06 | 1376 | 62931 | 130939 | 53037 | 985.60 |
| Hepatitis | 5.963 | 3.491 | 51.80 | 14.420 | 132.03 | 53037 | 84.211 |
| LungCancer | 42.973 | 31.148 | 40.203 | 29.405 | 30.810 | 33.615 | 22.220 |
| Average | 3.6E+12 | 1.3E+10 | 1.6E+12 | 6.9E+11 | 2.8E+11 | 3.4E+11 | 1.5E+10 |

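
The exact formula behind the Fisher index in the last table is not given on this page. For orientation only, the sketch below implements one common per-feature form of the Fisher score: the between-class scatter of the class means divided by the within-class scatter. The function name `fisher_score` and the choice of population variance inside each class are assumptions, so the values it produces should not be expected to reproduce the table.

```python
from collections import defaultdict


def fisher_score(values, labels):
    """Fisher score of a single feature (assumed textbook form).

    values: feature value for each sample.
    labels: class label for each sample (same length as values).
    Returns sum_k n_k * (mean_k - mean)^2 / sum_k n_k * var_k.
    """
    if len(values) != len(labels) or len(values) == 0:
        raise ValueError("values and labels must be non-empty and equal in length")

    # Group the feature values by class label.
    by_class = defaultdict(list)
    for value, label in zip(values, labels):
        by_class[label].append(value)

    overall_mean = sum(values) / len(values)
    between = 0.0
    within = 0.0
    for members in by_class.values():
        n_k = len(members)
        mean_k = sum(members) / n_k
        var_k = sum((v - mean_k) ** 2 for v in members) / n_k  # population variance
        between += n_k * (mean_k - overall_mean) ** 2
        within += n_k * var_k

    if within == 0.0:
        # No spread inside any class: the ratio is unbounded.
        return float("inf")
    return between / within
```

A higher score means the class means are far apart relative to the spread inside each class. Near-zero within-class scatter makes the ratio very large, which is one possible reason such indices can span many orders of magnitude across datasets.
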
