American Institute of Mathematical Sciences

August 2020, 19(8): 3947-3956. doi: 10.3934/cpaa.2020174

Quantitative robustness of localized support vector machines

Florian Dumpert

Federal Statistical Office of Germany, Gustav-Stresemann-Ring 11, 65189 Wiesbaden, and University of Bayreuth, Department of Mathematics, 95440 Bayreuth, Germany

Received March 2019; Revised July 2019; Published May 2020

Fund Project: The work was partially supported by grant CH 291/3-1 of the Deutsche Forschungsgemeinschaft (DFG)

The huge amount of data available nowadays challenges kernel-based machine learning algorithms such as SVMs with respect to runtime and storage capacity. Local approaches might help to relieve these issues and to improve statistical accuracy. It has already been shown that these local approaches are consistent and robust in a basic sense. This article refines the analysis of robustness properties by studying the so-called influence function, which expresses the differentiability of the learning method: we show that our locally learned predictor depends differentiably on the underlying distribution. The assumptions of the proven theorems can be verified without knowing anything about this distribution, which makes the results interesting from an applied point of view as well.
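
For orientation, the influence function mentioned in the abstract is usually defined, following Hampel, as a directional derivative of the learning method with respect to the distribution. The display below is the standard textbook form, not necessarily the exact formulation used in the article. The symbols $T$ (the map from a distribution to the learned predictor), $P$ (the underlying distribution), $z$ (a contaminating point), $\delta_z$ (the Dirac measure at $z$) and $\varepsilon$ (the contamination fraction) are notation assumed here for illustration.

$$
\operatorname{IF}(z; T, P) \;=\; \lim_{\varepsilon \downarrow 0} \frac{T\bigl((1-\varepsilon)P + \varepsilon\,\delta_z\bigr) - T(P)}{\varepsilon},
$$

whenever the limit exists. A bounded influence function means that an infinitesimal fraction of contamination at any single point $z$ changes the predictor $T(P)$ only by a bounded amount.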

Citation: Florian Dumpert. Quantitative robustness of localized support vector machines. Communications on Pure & Applied Analysis, 2020, 19 (8) : 3947-3956. doi: 10.3934/cpaa.2020174

