August 2020, 19(8): 3947-3956. doi: 10.3934/cpaa.2020174

Quantitative robustness of localized support vector machines

Florian Dumpert, Federal Statistical Office of Germany, Gustav-Stresemann-Ring 11, 65189 Wiesbaden, and University of Bayreuth, Department of Mathematics, 95440 Bayreuth, Germany

Received: March 2019. Revised: July 2019. Published: May 2020.

Fund Project: This work was partially supported by grant CH 291/3-1 of the Deutsche Forschungsgemeinschaft (DFG).

The huge amounts of data available nowadays challenge kernel-based machine learning algorithms such as SVMs with respect to runtime and storage requirements. Local approaches can relieve these issues and may even improve statistical accuracy. It has already been shown that such local approaches are consistent and robust in a basic sense. This article refines the robustness analysis toward the so-called influence function, which expresses the differentiability of the learning method: we show that the locally learned predictor depends differentiably on the underlying distribution. The assumptions of the proven theorems can be verified without any knowledge of this distribution, which makes the results interesting from an applied point of view as well.
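The two notions in the abstract — a localized learner (one kernel estimator per region of the input space) and the influence function (the derivative of the learned predictor with respect to a point-mass contamination of the distribution) — can be illustrated numerically. The following NumPy sketch is not the paper's method: it uses kernelized regularized least squares (squared loss, which has a closed-form solution) as a stand-in for an SVM solver, a hypothetical fixed split of the input space at x_1 = 0 as the regionalization, and a finite-difference quotient as a proxy for the influence function. All function names and parameter values are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(A, B, gamma=5.0):
    """Gaussian RBF kernel matrix between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def fit_rkhs(X, y, lam=1e-3, weights=None):
    """Regularized empirical risk minimization in an RKHS with squared loss
    (kernel ridge regression), used as a smooth stand-in for an SVM solver:
    minimize sum_i w_i (f(x_i) - y_i)^2 + lam * ||f||_H^2."""
    n = len(X)
    w = np.full(n, 1.0 / n) if weights is None else np.asarray(weights)
    K = gaussian_kernel(X, X)
    # Representer theorem: f = sum_i alpha_i k(x_i, .);
    # the first-order condition gives (diag(w) K + lam I) alpha = w * y.
    alpha = np.linalg.solve(np.diag(w) @ K + lam * np.eye(n), w * y)
    return lambda Xnew: gaussian_kernel(Xnew, X) @ alpha

def localized_fit(X, y, lam=1e-3):
    """Localized learner: split the input space at x_1 = 0 (a hypothetical
    regionalization) and train one kernel estimator per region."""
    regions = [X[:, 0] < 0.0, X[:, 0] >= 0.0]
    models = [fit_rkhs(X[m], y[m], lam) for m in regions]
    def predict(Xnew):
        out = np.empty(len(Xnew))
        left = Xnew[:, 0] < 0.0
        out[left] = models[0](Xnew[left])
        out[~left] = models[1](Xnew[~left])
        return out
    return predict

def empirical_influence(X, y, z, z_label, x_eval, eps=1e-3, lam=1e-3):
    """Finite-difference proxy for the influence function at the empirical
    distribution P_n: (f_{(1-eps) P_n + eps delta_z} - f_{P_n}) / eps,
    implemented by reweighting the sample and adding the point z once."""
    n = len(X)
    f_P = fit_rkhs(X, y, lam)
    X_c = np.vstack([X, z])
    y_c = np.append(y, z_label)
    w_c = np.append(np.full(n, (1.0 - eps) / n), eps)
    f_Q = fit_rkhs(X_c, y_c, lam, weights=w_c)
    return (f_Q(x_eval) - f_P(x_eval)) / eps
```

In this toy setting, a bounded difference quotient as eps shrinks corresponds to the differentiable (and hence robust) dependence of the predictor on the distribution that the article establishes for localized SVMs.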

Citation: Florian Dumpert. Quantitative robustness of localized support vector machines. Communications on Pure & Applied Analysis, 2020, 19 (8) : 3947-3956. doi: 10.3934/cpaa.2020174
References:
[1] N. Aronszajn, Theory of reproducing kernels, Trans. Amer. Math. Soc., 68 (1950), 337-404. doi: 10.2307/1990404.
[2] R. B. Ash and C. Doleans-Dade, Probability and Measure Theory, Academic Press, San Diego, 2000.
[3] A. Berlinet and C. Thomas-Agnan, Reproducing Kernel Hilbert Spaces in Probability and Statistics, Springer, New York, 2001. doi: 10.1007/978-1-4419-9096-9.
[4] B. E. Boser, I. M. Guyon and V. N. Vapnik, A training algorithm for optimal margin classifiers, in Proceedings of the Fifth Annual Workshop on Computational Learning Theory, (1992), 144-152.
[5] A. Christmann and I. Steinwart, On robust properties of convex risk minimization methods for pattern recognition, J. Mach. Learn. Res., 5 (2004), 1007-1034.
[6] A. Christmann, I. Steinwart and M. Hubert, Robust learning from bites for data mining, Comput. Stat. Data Anal., 52 (2007), 347-361. doi: 10.1016/j.csda.2006.12.009.
[7] A. Christmann and A. Van Messem, Bouligand derivatives and robustness of support vector machines for regression, J. Mach. Learn. Res., 9 (2008), 915-936.
[8] A. Christmann, A. Van Messem and I. Steinwart, On consistency and robustness properties of support vector machines for heavy-tailed distributions, Stat. Interface, 2 (2009), 311-327. doi: 10.4310/SII.2009.v2.n3.a5.
[9] C. Cortes and V. Vapnik, Support-vector networks, Mach. Learn., 20 (1995), 273-297.
[10] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, Cambridge University Press, 2000.
[11] F. Cucker and D. X. Zhou, Learning Theory: An Approximation Theory Viewpoint, Cambridge University Press, 2007. doi: 10.1017/CBO9780511618796.
[12] Z. Denkowski, S. Migórski and N. S. Papageorgiou, An Introduction to Nonlinear Analysis: Theory, Kluwer Academic/Plenum Publishers, New York, 2003. doi: 10.1007/978-1-4419-9158-4.
[13] F. Dumpert and A. Christmann, Universal consistency and robustness of localized support vector machines, Neurocomputing, 315 (2018), 96-106.
[14] N. Dunford and J. T. Schwartz, Linear Operators, Part I, Interscience Publishers, New York, 1958.
[15] R. Hable and A. Christmann, Robustness versus consistency in ill-posed classification and regression problems, in Classification and Data Mining (eds. A. Giusti, G. Ritter and M. Vichi), Springer, Berlin, (2013), 27-35.
[16] F. R. Hampel, Contributions to the theory of robust estimation, Ph.D. thesis, University of California, Berkeley, 1968.
[17] Y. Ma and G. Guo, Support Vector Machines Applications, Springer, New York, 2014.
[18] B. Schölkopf and A. J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, MIT Press, Cambridge, 2001.
[19] I. Steinwart and A. Christmann, Support Vector Machines, Springer, New York, 2008.
[20] A. Van Messem and A. Christmann, A review on consistency and robustness properties of support vector machines for heavy-tailed distributions, Adv. Data Anal. Classif., 4 (2010), 199-220. doi: 10.1007/s11634-010-0067-2.
[21] Z. Wu, Compactly supported positive definite radial functions, Adv. Comput. Math., 4 (1995), 283-292. doi: 10.1007/BF03177517.
