February 2020, 3(1): 25-40. doi: 10.3934/mfc.2020003

Error analysis on regularized regression based on the Maximum correntropy criterion

School of Mathematical Sciences, Zhejiang University, Hangzhou, 310027, China

* Corresponding author: Bingzheng Li

Received December 2019; Published February 2020

This paper studies the regularized learning algorithm for regression associated with correntropy-induced losses in reproducing kernel Hilbert spaces. The main target is an error analysis for the regression problem in learning theory based on the maximum correntropy criterion. Explicit learning rates are provided; with a suitable choice of the scaling parameter of the loss function, the rates obtained are satisfactory. The rates depend on the regularization error and on the covering numbers of the reproducing kernel Hilbert space.
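
To make the setting of the abstract concrete: the estimator is the minimizer over the reproducing kernel Hilbert space H_K of the empirical correntropy objective (1/m) sum_i sigma^2 (1 - exp(-(y_i - f(x_i))^2 / sigma^2)) + lambda ||f||_K^2. The sketch below shows one common way such an estimator can be computed with a Gaussian kernel, via half-quadratic (iteratively reweighted least-squares) optimization. This is an illustration only, not the algorithm analyzed in the paper; the function names and parameter values (sigma, lam, width, n_iter) are assumptions made for the example.

    import numpy as np

    def gaussian_kernel(X1, X2, width=1.0):
        # Gaussian (RBF) kernel matrix K[i, j] = exp(-|x_i - x_j|^2 / (2 width^2)).
        sq = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-sq / (2.0 * width ** 2))

    def mcc_kernel_regression(X, y, sigma=1.0, lam=1e-3, width=1.0, n_iter=50):
        # Minimize (1/m) sum_i sigma^2 (1 - exp(-r_i^2 / sigma^2)) + lam ||f||_K^2
        # over f = sum_j alpha_j K(., x_j) by half-quadratic reweighting.
        m = len(y)
        K = gaussian_kernel(X, X, width)
        # Least-squares warm start (the sigma -> infinity limit of the loss).
        alpha = np.linalg.solve(K + m * lam * np.eye(m), y)
        for _ in range(n_iter):
            r = y - K @ alpha                    # residuals of the current fit
            w = np.exp(-r ** 2 / sigma ** 2)     # correntropy weights: near 0 on outliers
            # Weighted kernel ridge step: solve (W K + m lam I) alpha = W y.
            alpha = np.linalg.solve(np.diag(w) @ K + m * lam * np.eye(m), w * y)
        return alpha, K

    # Hypothetical usage: 1-D regression with a few gross outliers.
    rng = np.random.default_rng(0)
    X = rng.uniform(-1.0, 1.0, size=(80, 1))
    y = np.sin(np.pi * X[:, 0]) + 0.1 * rng.standard_normal(80)
    y[:5] += 5.0                                 # corrupt five responses
    alpha, K = mcc_kernel_regression(X, y, sigma=0.5, lam=1e-2, width=0.3)
    y_hat = K @ alpha                            # in-sample predictions

Each pass reduces to a weighted kernel ridge regression step, and the weights w_i = exp(-r_i^2 / sigma^2) shrink toward zero on large residuals; this down-weighting of outliers is the source of the robustness that motivates the maximum correntropy criterion.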

Citation: Bingzheng Li, Zhengzhan Dai. Error analysis on regularized regression based on the Maximum correntropy criterion. Mathematical Foundations of Computing, 2020, 3 (1) : 25-40. doi: 10.3934/mfc.2020003
References:
[1] R. J. Bessa, V. Miranda and J. Gama, Entropy and correntropy against minimum square error in offline and online three-day ahead wind power forecasting, IEEE Transactions on Power Systems, 24 (2009), 1657-1666.
[2] F. Cucker and D. X. Zhou, Learning Theory: An Approximation Theory Viewpoint, Cambridge University Press, 2007. doi: 10.1017/CBO9780511618796.
[3] J. Fan, T. Hu, Q. Wu and D. X. Zhou, Consistency analysis of an empirical minimum error entropy algorithm, Applied and Computational Harmonic Analysis, 41 (2016), 164-189. doi: 10.1016/j.acha.2014.12.005.
[4] Y. L. Feng, J. Fan and J. A. K. Suykens, A statistical learning approach to modal regression, Journal of Machine Learning Research, 2020.
[5] Y. L. Feng, X. L. Huang, L. Shi and J. A. K. Suykens, Learning with the maximum correntropy criterion induced losses for regression, Journal of Machine Learning Research, 16 (2015), 993-1034.
[6] Y. L. Feng and Y. M. Ying, Learning with correntropy-induced losses for regression with mixture of symmetric stable noise, Applied and Computational Harmonic Analysis, 48 (2020), 795-810. doi: 10.1016/j.acha.2019.09.001.
[7] R. He, W. S. Zheng and B. G. Hu, Maximum correntropy criterion for robust face recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, 33 (2011), 1561-1576.
[8] R. He, W. S. Zheng, B. G. Hu and X. W. Kong, A regularized correntropy framework for robust pattern recognition, Neural Computation, 23 (2011), 2074-2100. doi: 10.1162/NECO_a_00155.
[9] P. J. Huber, Robust Statistics, John Wiley & Sons, 1981.
[10] T. Hu, J. Fan, Q. Wu and D. X. Zhou, Learning theory approach to minimum error entropy criterion, Journal of Machine Learning Research, 14 (2013), 377-397.
[11] T. Hu, J. Fan, Q. Wu and D. X. Zhou, Regularization schemes for minimum error entropy principle, Analysis and Applications, 13 (2015), 437-455. doi: 10.1142/S0219530514500110.
[12] B. Z. Li, Approximation by multivariate Bernstein-Durrmeyer operators and learning rates of least-square regularized regression with multivariate polynomial kernels, Journal of Approximation Theory, 173 (2013), 33-55. doi: 10.1016/j.jat.2013.04.007.
[13] W. Liu, P. P. Pokharel and J. C. Príncipe, Correntropy: Properties and applications in non-Gaussian signal processing, IEEE Transactions on Signal Processing, 55 (2007), 5286-5298. doi: 10.1109/TSP.2007.896065.
[14] F. S. Lv and J. Fan, Optimal learning with Gaussians and correntropy loss, Analysis and Applications, 2020. doi: 10.1142/S0219530519410124.
[15] I. Santamaría, P. P. Pokharel and J. C. Príncipe, Generalized correlation function: Definition, properties, and application to blind equalization, IEEE Transactions on Signal Processing, 54 (2006), 2187-2197.
[16] S. Smale and D. X. Zhou, Estimating the approximation error in learning theory, Analysis and Applications, 1 (2003), 17-41. doi: 10.1142/S0219530503000089.
[17] I. Steinwart and C. Scovel, Fast rates for support vector machines, Lecture Notes in Computer Science, 3559 (2005), 279-294. doi: 10.1007/11503415_19.
[18] V. Vapnik, Statistical Learning Theory, John Wiley & Sons, 1998.
[19] E. De Vito, L. Rosasco, A. Caponnetto and U. De Giovannini, Learning from examples as an inverse problem, Journal of Machine Learning Research, 6 (2005), 883-904.
[20] C. Wang and D. X. Zhou, Optimal learning rates for least squares regularized regression with unbounded sampling, Journal of Complexity, 27 (2011), 55-67. doi: 10.1016/j.jco.2010.10.002.
[21] Q. Wu, Y. Ying and D. X. Zhou, Learning rates of least-square regularized regression, Foundations of Computational Mathematics, 6 (2006), 171-192. doi: 10.1007/s10208-004-0155-9.
[22] Q. Wu, Y. Ying and D. X. Zhou, Multi-kernel regularized classifiers, Journal of Complexity, 23 (2007), 108-134. doi: 10.1016/j.jco.2006.06.007.
[23] Y. M. Ying and D. X. Zhou, Learnability of Gaussians with flexible variances, Journal of Machine Learning Research, 8 (2007), 249-276.
[24] D. X. Zhou, The covering number in learning theory, Journal of Complexity, 18 (2002), 739-767. doi: 10.1006/jcom.2002.0635.
[25] D. X. Zhou, Capacity of reproducing kernel spaces in learning theory, IEEE Transactions on Information Theory, 49 (2003), 1743-1752. doi: 10.1109/TIT.2003.813564.

