August  2020, 19(8): 4023-4054. doi: 10.3934/cpaa.2020178

Convergence of online pairwise regression learning with quadratic loss

1. School of Statistics and Mathematics, Zhejiang Gongshang University, Hangzhou, 310018, China

2. Department of Applied Statistics, Shaoxing University, Shaoxing, 312000, China

* Corresponding author

Received: July 2019. Revised: November 2019. Published: May 2020.

Fund Project: This work is supported by the National Natural Science Foundation of China under Grant Nos. 61877039 and 11971432, and by the Humanities and Social Sciences Research Project of the Ministry of Education (18YJA910001).

Recent investigations into the error analysis of kernel regularized pairwise learning have initiated theoretical research on pairwise reproducing kernel Hilbert spaces (PRKHSs). In the present paper, we provide a method for constructing PRKHSs with classical Jacobi orthogonal polynomials. The performance of kernel regularized online pairwise regression learning algorithms based on a quadratic loss function is investigated. Applying convex analysis and Rademacher complexity techniques, explicit bounds for the generalization error are derived. It is shown that the convergence rate can be greatly improved by adjusting the scale parameters in the loss function.
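To illustrate the class of algorithms the abstract refers to, the following Python code is a minimal, hypothetical sketch of a kernel-based online pairwise regression iteration with quadratic loss. The pairwise Gaussian kernel, the pairing with the previous sample, and the step-size schedule are illustrative assumptions, not the paper's exact algorithm or constants.

```python
import numpy as np

def pair_kernel(p, q, sigma=1.0):
    """Gaussian kernel on input pairs p = (x, x'), q = (u, u') (illustrative choice)."""
    d = np.sum((np.concatenate(p) - np.concatenate(q)) ** 2)
    return np.exp(-d / (2.0 * sigma ** 2))

def online_pairwise_regression(X, y, eta=0.1, lam=0.01):
    """At step t, pair the new sample with the previous one and take a
    stochastic gradient step on the regularized quadratic pairwise loss
    (f(x_t, x_j) - (y_t - y_j))^2 + lam * ||f||_K^2.
    The iterate f_t is stored as a kernel expansion over the seen pairs."""
    pairs, coefs = [], []

    def f(p):
        # Evaluate the current kernel expansion at a pair p.
        return sum(c * pair_kernel(q, p) for q, c in zip(pairs, coefs))

    for t in range(1, len(X)):
        j = t - 1                    # simplest pairing: the previous sample
        p = (X[t], X[j])
        step = eta / np.sqrt(t)      # decreasing step size, a common choice
        grad = f(p) - (y[t] - y[j])  # derivative of the quadratic loss at f_t
        coefs = [(1.0 - step * lam) * c for c in coefs]  # regularization shrinkage
        pairs.append(p)
        coefs.append(-step * grad)   # coefficient of the newly added kernel section
    return pairs, coefs, f
```

For instance, on data where the target difference y(x) - y(x') is linear in x - x', the learned expansion approximates the pairwise regression function (x, x') -> y(x) - y(x').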

Citation: Shuhua Wang, Zhenlong Chen, Baohuai Sheng. Convergence of online pairwise regression learning with quadratic loss. Communications on Pure & Applied Analysis, 2020, 19 (8) : 4023-4054. doi: 10.3934/cpaa.2020178
References:
[1]

S. Agarwal and P. Niyogi, Generalization bounds for ranking algorithms via algorithmic stability, J. Mach. Learn. Res., 10 (2009), 441-474.   Google Scholar

[2]

R. Aktas, A. Altin and F. Tasdelen, A note on a family of two-variable polynomials, J. Comput. Appl. Math., 235 (2011), 4825-4833.  doi: 10.1016/j.cam.2010.11.005.  Google Scholar

[3]

A. Altin, R. Aktas and E. Erkus-Duman, On a multivariable extension for the extended Jacobi polynomials, J. Math. Anal. Appl., 353 (2009), 121-133.  doi: 10.1016/j.jmaa.2008.11.070.  Google Scholar

[4]

P. L. Bartlett, O. Bousquet and S. Mendelson, Local Rademacher complexities, Ann. Stat., 33 (2005), 1497-1537.  doi: 10.1214/009053605000000282.  Google Scholar

[5]

P. L. Bartlett and S. Mendelson, Rademacher and Gaussian complexities: risk bounds and structural results, J. Mach. Learn. Res., 3 (2002), 463-482.  doi: 10.1162/153244303321897690.  Google Scholar

[6]

H. H. Bauschke and P. L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces, Springer-Verlag, New York, 2010. doi: 10.1007/978-1-4419-9467-7.  Google Scholar

[7]

A. Bezubik, J. Hrivnák and S. Pošta, Two dimensional symmetric and antisymmetric exponential functions and orthogonal polynomials, J. Phys. Confer. Ser., 411 (2013), Art. 012007. doi: 10.1088/1742-6596/411/1/012007.  Google Scholar

[8]

M. Boissier, S. Lyu, Y. Ying and D. X. Zhou, Fast convergence of online pairwise learning algorithms, in Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, (2016), 204–212. Google Scholar

[9] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge, Cambridge University Press, 2004.  doi: 10.1017/CBO9780511804441.  Google Scholar
[10]

C. Brunner, A. Fischer, K. Luig and T. Thies, Pairwise support vector machines and their application to large scale problems, J. Mach. Learn. Res., 13 (2012), 2279-2292.   Google Scholar

[11]

Q. Cao, Z. C. Guo and Y. M. Ying, Generalization bounds for metric and similarity learning, Mach. Learn., 102 (2016), 115-132.  doi: 10.1007/s10994-015-5499-7.  Google Scholar

[12]

H. Chen, Z. B. Pan and L. Q. Li, Learning performance of coefficient-based regularized ranking, Neurocomputing, 133 (2014), 54-62.   Google Scholar

[13]

H. Chen and J. T. Wu, Regularized ranking with convex losses and $\ell^{1}$ penalty, Abst. Appl. Anal., (2013), Art. 927827, 8 pp. doi: 10.1155/2013/927827.  Google Scholar

[14]

R. H. Chen and Z. M. Wu, Solving partial differential equation by using multiquadratic quasi-interpolation, Appl. Math. Comput., 186 (2007), 1502-1510.  doi: 10.1016/j.amc.2006.07.160.  Google Scholar

[15]

X. M. Chen and Y. W. Lei, Refined bounds for online pairwise learning algorithms, Neurocomputing, 275 (2018), 2656-2665.   Google Scholar

[16]

S. O. Cheng and A. J. Smola, Hyperkernels, Conference Paper, 2002. Available from: https://www.researchgate.net/publication/221618045. Google Scholar

[17]

S. O. Cheng, A. J. Smola and R. C. Williamson, Learning the kernel with hyperkernels, J. Mach. Learn. Res., 6 (2005), 1043-1071.   Google Scholar

[18]

A. Christmann and D. X. Zhou, On the robustness of regularized pairwise learning methods based on kernels, J. Complexity, 37 (2016), 1-33.  doi: 10.1016/j.jco.2016.07.001.  Google Scholar

[19]

F. Cucker and S. Smale, On the mathematical foundations of learning, Bull. Amer. Math. Soc., 39 (2002), 1-49.  doi: 10.1090/S0273-0979-01-00923-5.  Google Scholar

[20] F. Cucker and D. X. Zhou, Learning Theory: An Approximation Theory Viewpoint, Cambridge University Press, New York, 2007.  doi: 10.1017/CBO9780511618796.  Google Scholar
[21]

S. R. Das, Kuhoo, D. Mishra and M. Rout, An optimized feature reduction based currency forecasting model exploring the online sequential extreme learning machine and krill herd strategies, Physica A, 513 (2019), 339-370.   Google Scholar

[22] C. F. Dunkl and Y. Xu, Orthogonal Polynomials of Several Variables, Cambridge University Press, 2014.  doi: 10.1017/CBO9781107786134.  Google Scholar
[23]

Q. Fang, M. Xu and Y. M. Ying, Faster convergence of a randomized coordinate descent method for linearly constrained optimization problems, Anal. Appl., 16 (2018), 741-755.  doi: 10.1142/S0219530518500082.  Google Scholar

[24]

L. Fernández, T. E. Pérez and M. A. Piñar, On Koornwinder classical orthogonal polynomials in two variables, J. Comput. Appl. Math., 236 (2012), 3817-3826.  doi: 10.1016/j.cam.2011.08.017.  Google Scholar

[25]

W. W. Gao and Z. G. Wang, A meshless scheme for partial differential equations based on multiquadratic trigonometric b-spline quasi-interpolation, Chin. Phys. B, 23 (2014), Art. 110207. Google Scholar

[26]

Z. C. Guo, Y. M. Ying and D. X. Zhou, Online regularized learning with pairwise loss functions, Adv. Comput. Math., 43 (2017), 127-150.  doi: 10.1007/s10444-016-9479-7.  Google Scholar

[27]

T. Hu, J. Fan and D. H. Xiang, Convergence analysis of distributed multi-penalty regularized pairwise learning, Anal. Appl., 18 (2020), 109-127.  doi: 10.1142/S0219530519410045.  Google Scholar

[28]

S. Karlin and J. Gregor, Determinants of orthogonal polynomials, Bull. Amer. Math. Soc., 68 (1962), 204-209.  doi: 10.1090/S0002-9904-1962-10749-6.  Google Scholar

[29]

J. Kiefer and J. Wolfowitz, Stochastic estimation of the maximum of a regression function, Ann. Math. Statist., 23 (1952), 462-466.  doi: 10.1214/aoms/1177729392.  Google Scholar

[30]

A. Klimyk and J. Patera, (Anti)Symmetric multivariable exponential functions and corresponding Fourier transforms, J. Phys. A Math. Theor., 40 (2007), 10473-10489.  doi: 10.1088/1751-8113/40/34/006.  Google Scholar

[31]

T. Koornwinder, Two-variable analogues of the classical orthogonal polynomials, in Theory and Application of Special Functions (eds. R. A. Askey), Academic Press, New York, (1975), 435–495.  Google Scholar

[32]

T. Koornwinder, Yet another proof of the addition formula for Jacobi polynomials, J. Math. Anal. Appl., 61 (1977), 136-141.  doi: 10.1016/0022-247X(77)90149-4.  Google Scholar

[33]

T. Koornwinder, Orthogonal polynomials in two variables which are eigenfunctions of two algebraically independent partial differential operators, I, II, Indag. Math., 36 (1974), 48-66.   Google Scholar

[34]

H. L. Krall and I. M. Sheffer, Orthogonal polynomials in two variables, Ann. Mat. Pura Appl., 76 (1967), 325-376.  doi: 10.1007/BF02412238.  Google Scholar

[35]

G. Kriukova, S. V. Pereverzyev and P. Tkachenko, On the convergence rate and some applications of regularized ranking algorithms, J. Complexity, 33 (2016), 14-29.  doi: 10.1016/j.jco.2015.09.004.  Google Scholar

[36]

D. W. Lee, Partial differential equations for products of two classical orthogonal polynomials, Bull. Korean Math. Soc., 42 (2005), 179-188.  doi: 10.4134/BKMS.2005.42.1.179.  Google Scholar

[37]

S. B. Lin, J. S. Zeng, J. Fang and Z. B. Xu, Learning rates of $l^q$ coefficient regularization learning with Gaussian kernel, Neural Comput., 26 (2014), 2359-2378.  doi: 10.1162/NECO_a_00641.  Google Scholar

[38]

J. H. Lin and D. X. Zhou, Online learning algorithms can converge comparably fast as batch learning, IEEE Trans. Neural Netw. Learn. Syst., 29 (2017), 2367-2378.  doi: 10.1109/TNNLS.2017.2677970.  Google Scholar

[39]

J. H. Lin, Y. W. Lei, B. Zhang and D. X. Zhou, Online pairwise learning algorithms with convex loss functions, Inform. Sci., 406-407 (2017), 57-70.   Google Scholar

[40]

H. X. Liu, B. H. Sheng and P. X. Ye, The improved learning rate for regularized regression with RKBSs, Int. J. Mach. Learn. Cyber., 8 (2017), 1235-1245.   Google Scholar

[41]

P. Malave, Multi-dimensional Orthogonal Polynomials, Ph.D. thesis, Marathwada University, Aurangabad, 1979. Google Scholar

[42] L. Mason, J. Baxter, P. Bartlett and M. Frean, Functional Gradient Techniques for Combining Hypotheses, MIT Press, 1999.   Google Scholar
[43]

R. Meir and T. Zhang, Generalization error bounds for Bayesian mixture algorithms, J. Mach. Learn. Res., 4 (2003), 839-860.  doi: 10.1162/1532443041424300.  Google Scholar

[44]

G. V. Milovanović, G. Öztürk and R. Aktas, Properties of some of two-variable orthogonal polynomials, Bull. Malays. Math. Sci. Soc., 43 (2020), 1403-1431.  doi: 10.1007/s40840-019-00750-8.  Google Scholar

[45]

T. Pahikkala, M. Viljanen, A. Airola and W. Waegeman, Spectral analysis of symmetric and antisymmetric pairwise kernels, preprint, arXiv: 1506.05950v1, (2015), 19 pp. Google Scholar

[46]

N. Ratliff and J. Bagnell, Kernel conjugate gradient for fast kernel machines, In IJCAI, 20 (2007), 1017-1021.   Google Scholar

[47]

W. Rejchel, On ranking and generalization bounds, J. Mach. Learn. Res., 13 (2012), 1373-1392.   Google Scholar

[48]

H. Robbins and S. Monro, A stochastic approximation method, Ann. Math. Statist., 22 (1951), 400-407.  doi: 10.1214/aoms/1177729586.  Google Scholar

[49]

H. Rosengren, Multivariable Christoffel-Darboux kernels and characteristic polynomials of random Hermitian matrices, Symmetry Integr. Geom. Meth. Appl., 2 (2006), Art. 085, 12 pp. doi: 10.3842/SIGMA.2006.085.  Google Scholar

[50] B. Schölkopf, R. Herbrich and A. J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, MIT Press, 2002.   Google Scholar
[51]

B. H. Sheng, The weighted norm for some Mercer kernel matrices, Acta Math. Sci., 33A (2013), 6–15 (in Chinese).  Google Scholar

[52]

B. H. Sheng, L. Q. Duan and P. X. Ye, Strong convex loss can increase the learning rates of online learning, J. Computer, 9 (2014), 1606-1611.   Google Scholar

[53]

B. H. Sheng and J. L. Wang, On the K-functional in learning theory, Anal. Appl., 18 (2020), 423-446.  doi: 10.1142/S0219530519500192.  Google Scholar

[54]

B. H. Sheng and D. H. Xiang, The performance of semi-supervised Laplacian regularized regression with least square loss, Int. J. Wavelets Multiresolut. Inform. Process., 15 (2017), Art. 1750016, 31 pp. doi: 10.1142/S0219691317500163.  Google Scholar

[55]

B. H. Sheng and H. Z. Zhang, Performance analysis of the LapRSSLG algorithm in learning theory, Anal. Appl., 18 (2020), 79-108.  doi: 10.1142/S0219530519410033.  Google Scholar

[56]

B. H. Sheng and H. C. Zhu, The convergence rate of semi-supervised regression with quadratic loss, Appl. Math. Comput., 321 (2018), 11-24.  doi: 10.1016/j.amc.2017.10.033.  Google Scholar

[57]

S. Smale and D. X. Zhou, Estimating the approximation error in learning theory, Anal. Appl., 1 (2003), 1-25.  doi: 10.1142/S0219530503000089.  Google Scholar

[58]

S. Smale and Y. Yao, Online learning algorithms, Found. Comput. Math., 6 (2006), 145-170.  doi: 10.1007/s10208-004-0160-z.  Google Scholar

[59]

I. Sprinkhuizen-Kuyper, Orthogonal polynomials in two variables. A further analysis of the polynomials orthogonal over a region bounded by two lines and a parabola, SIAM J. Math. Anal., 7 (1976), 501-518.  doi: 10.1137/0507041.  Google Scholar

[60]

I. Steinwart and C. Scovel, Mercer's theorem on general domains: on the interaction between measures, kernels, and RKHSs, Constr. Approx., 35 (2012), 363-417.  doi: 10.1007/s00365-012-9153-3.  Google Scholar

[61]

H. Sun, Mercer theorem for RKHS on noncompact sets, J. Complexity, 21 (2005), 337-349.  doi: 10.1016/j.jco.2004.09.002.  Google Scholar

[62]

S. H. Wang, Z. L. Chen and B. H. Sheng, The convergence rate of SVM for kernel-based robust regression, Int. J. Wavelets Multiresolut. Inform. Process., 17 (2019), Art. 1950004. doi: 10.1142/S0219691319500048.  Google Scholar

[63]

C. Wang and T. Hu, Online regularized pairwise learning with least squares loss, Anal. Appl., 18 (2020), 49-78.  doi: 10.1142/S0219530519410070.  Google Scholar

[64]

Y. Wang, R. Khardon, D. Pechyony and R. Jones, Generalization bounds for online learning algorithms with pairwise loss functions, in JMLR: Workshop and Conference Proceedings, Annual Conference on Learning Theory, 23 (2012), 13.1–13.22. Google Scholar

[65]

Q. Wu, Y. M. Ying and D. X. Zhou, Multi-kernel regularized classifiers, J. Complexity, 23 (2007), 108-134.  doi: 10.1016/j.jco.2006.06.007.  Google Scholar

[66]

Z. M. Wu, Dynamical knot and shape parameter setting for simulating shock wave by using multi-quadratic quasi-interpolation, Eng. Anal. Bound. Elem., 29 (2005), 354-358.   Google Scholar

[67]

Z. M. Wu and R. Schaback, Shape preserving properties and convergence of univariate multiquadratic quasi-interpolation, Acta Math. Appl. Sin., 10 (1994), 441-446.  doi: 10.1007/BF02016334.  Google Scholar

[68]

D. H. Xiang, Conditional quantiles with varying Gaussians, Adv. Comput. Math., 38 (2013), 723-735.  doi: 10.1007/s10444-011-9257-5.  Google Scholar

[69]

G. B. Ye and D. X. Zhou, Fully online classification by regularization, Appl. Comput. Harmon. Anal., 23 (2007), 198-214.  doi: 10.1016/j.acha.2006.12.001.  Google Scholar

[70]

Y. M. Ying and D. X. Zhou, Learnability of Gaussians with flexible variances, J. Mach. Learn. Res., 8 (2007), 249-276.   Google Scholar

[71]

Y. M. Ying and D. X. Zhou, Online regularized classification algorithms, IEEE Trans. Inform. Theory, 52 (2006), 4775-4788.  doi: 10.1109/TIT.2006.883632.  Google Scholar

[72]

Y. M. Ying and D. X. Zhou, Online pairwise learning algorithms, Neural Comput., 28 (2016), 743-777.  doi: 10.1162/neco_a_00817.  Google Scholar

[73]

Y. M. Ying and D. X. Zhou, Unregularized online learning algorithms with general loss functions, Appl. Comput. Harmon. Anal., 42 (2017), 224-244.  doi: 10.1016/j.acha.2015.08.007.  Google Scholar

[74]

Y. X. Zeng and D. Klabjian, Online adaptive machine learning based algorithm for implied volatility surface modeling, Knowledge-Based Syst., 163 (2019), 376-391.   Google Scholar

[75]

L. L. Zhang and B. H. Sheng, Online regularized generalized gradient classification algorithms, Anal. Theory Appl., 26 (2010), 278-300.  doi: 10.1007/s10496-010-0278-6.  Google Scholar

[76]

Y. L. Zhao, J. Fan and L. Shi, Learning rates for regularized least squares ranking algorithm, Anal. Appl., 15 (2017), 815-836.  doi: 10.1142/S0219530517500063.  Google Scholar
