Averaging versus voting: A comparative study of strategies for distributed classification
Department of Mathematical Sciences, Middle Tennessee State University, 1301 E Main Street, Murfreesboro, TN 37132, USA
In this paper we propose two strategies, averaging and voting, for implementing distributed classification via the divide-and-conquer approach. When a data set is too large to be processed by a single processor, or is naturally stored in different locations, the method partitions the data into multiple subsets, either randomly or according to location. A base classification algorithm is then applied to each subset to produce a local classification model. Finally, averaging or voting combines the local models into the final classification model. We conducted thorough empirical studies to compare the two strategies. The results show that averaging is more effective in most scenarios.
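The abstract does not spell out the two combination rules, so the following is only a minimal sketch under stated assumptions: labels are in {-1, +1}, each local model returns a real-valued score, averaging takes the sign of the mean score, and voting takes the sign of the summed local decisions. The centroid-based base learner and all function names are hypothetical stand-ins chosen for illustration, not the base algorithms used in the paper.

```python
import random


def sign(v):
    # Assumption: ties (v == 0) are broken in favor of the +1 class.
    return 1 if v >= 0 else -1


def partition(X, y, m, seed=0):
    """Randomly split (X, y) into m disjoint subsets of near-equal size."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    return [([X[i] for i in idx[j::m]], [y[i] for i in idx[j::m]])
            for j in range(m)]


def fit_centroid_scorer(X, y):
    """Hypothetical base learner: a linear score built from class centroids.

    Returns a function f with f(x) > 0 when x is closer to the +1 centroid.
    """
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == -1]
    if not pos or not neg:
        raise ValueError("each subset must contain both classes")
    dim = len(X[0])
    mu_pos = [sum(x[k] for x in pos) / len(pos) for k in range(dim)]
    mu_neg = [sum(x[k] for x in neg) / len(neg) for k in range(dim)]
    w = [p - n for p, n in zip(mu_pos, mu_neg)]
    mid = [(p + n) / 2 for p, n in zip(mu_pos, mu_neg)]
    b = -sum(wk * mk for wk, mk in zip(w, mid))
    return lambda x: sum(wk * xk for wk, xk in zip(w, x)) + b


def averaging_predict(scorers, x):
    """Averaging: classify by the sign of the mean local score."""
    return sign(sum(f(x) for f in scorers) / len(scorers))


def voting_predict(scorers, x):
    """Voting: each local model casts a +1/-1 vote; the majority wins."""
    return sign(sum(sign(f(x)) for f in scorers))
```

Under these assumptions the two rules can disagree. With three local scores 3.0, -0.5 and -0.5 at some point, averaging returns +1 because the mean is positive, while voting returns -1 because two of the three local decisions are negative. Averaging keeps the magnitude of each local score, whereas voting discards it.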
Classification Task | Number of Observations | Number of Features |
Default of Credit Card Clients | 30,000 | 23 |
Wilt Diseased Tree Detection | 4,889 | 5 |
APS Failure | 60,000 | 170 |
MAGIC Gamma Telescope | 19,020 | 10 |
Spam Email Detection | 4,601 | 57 |
Epileptic Seizures | 9,200 | 178 |
Wireless Localization {1, 2} vs {3, 4} | 2,000 | 7 |
Student Evaluation {1, 2} vs {3, 4, 5} | 5,046 | 32 |
Handwritten Digits 5 vs 8 | 12,017 | 786 |
Classification Task | Voting | Averaging | p-value |
Default of Credit Card Clients | 73.71 | 80.15 | <2.2e-16 |
Wilt Diseased Tree Detection | 95.42 | 96.94 | <2.2e-16 |
APS Failure | 98.39 | 98.75 | <2.2e-16 |
MAGIC Gamma Telescope | 79.18 | 79.18 | 0.9845 |
Spam Email Detection | 61.52 | 92.83 | <2.2e-16 |
Epileptic Seizure | 50.10 | 66.11 | <2.2e-16 |
Wireless Localization {1, 2} vs {3, 4} | 91.77 | 95.15 | <2.2e-16 |
Student Evaluation {1, 2} vs {3, 4, 5} | 91.81 | 95.17 | <2.2e-16 |
Handwritten Digits 5 vs 8 | 84.46 | 95.84 | <2.2e-16 |
Classification Task | Voting | Averaging | p-value |
Default of Credit Card Clients | 79.29 | 79.48 | 9.2e-05 |
Wilt Diseased Tree Detection | 96.83 | 97.19 | 4.6e-08 |
APS Failure | 98.52 | 98.60 | <2.2e-16 |
MAGIC Gamma Telescope | 86.59 | 86.64 | 0.2107 |
Spam Email Detection | 93.20 | 93.47 | 0.0001 |
Epileptic Seizure | 89.16 | 89.46 | 0.0008 |
Wireless Localization {1, 2} vs {3, 4} | 95.42 | 95.47 | 0.3773 |
Student Evaluation {1, 2} vs {3, 4, 5} | 95.31 | 95.34 | 0.6433 |
Handwritten Digits 5 vs 8 | 99.50 | 99.54 | 2.3e-05 |