August  2020, 3(3): 205-218. doi: 10.3934/mfc.2020020

Sparse regularized learning in the reproducing kernel banach spaces with the $ \ell^1 $ norm

1. 

School of Mathematical Sciences, South China Normal University, Guangzhou, Guangdong 510631, China

2. 

School of Data and Computer Science, Sun Yat-sen University, Guangzhou, Guangdong 510006, China

* Corresponding author: Rongrong Lin

Received  November 2019 Revised  June 2020 Published  June 2020

We present a sparse representer theorem for regularization networks in a reproducing kernel Banach space with the $ \ell^1 $ norm by the theory of convex analysis. The theorem states that extreme points of the solution set of regularization networks in such a sparsity-promoting space belong to the span of kernel functions centered on at most $ n $ adaptive points of the input space, where $ n $ is the number of training data. Under the Lebesgue constant assumptions on reproducing kernels, we can recover the relaxed representer theorem and the exact representer theorem in that space in the literature. Finally, we perform numerical experiments for synthetic data and real-world benchmark data in the reproducing kernel Banach spaces with the $ \ell^1 $ norm and the reproducing kernel Hilbert spaces both with Laplacian kernels. The numerical performance demonstrates the advantages of sparse regularized learning.

Citation: Ying Lin, Rongrong Lin, Qi Ye. Sparse regularized learning in the reproducing kernel banach spaces with the $ \ell^1 $ norm. Mathematical Foundations of Computing, 2020, 3 (3) : 205-218. doi: 10.3934/mfc.2020020
References:
[1]

S. BoydN. ParikhE. ChuB. Peleato and J. Eckstein, Distributed optimization and statistical learning via the alternating direction method of multipliers, Found. Trends Mach. Learn., 3 (2011), 1-122.  doi: 10.1561/2200000016.  Google Scholar

[2]

C. BoyerA. ChambolleY. De CastroV. DuvalF. de Gournay and P. Weiss, On representer theorems and convex regularization, SIAM J. Optim., 29 (2019), 1260-1281.  doi: 10.1137/18M1200750.  Google Scholar

[3]

O. Christensen, An Introduction to Frames and Riesz Bases, 2nd edition, Applied and Numerical Harmonic Analysis, Birkhäuser/Springer, [Cham], 2016. doi: 10.1007/978-3-319-25613-9.  Google Scholar

[4]

H. G. Dales, F. K. Dashiell Jr., A. T.-M. Lau and D. Strauss, Banach Spaces of Continuous Functions as Dual Spaces, CMS Books in Mathematics/Ouvrages de Mathématiques de la SMC, Springer, Cham, 2016. doi: 10.1007/978-3-319-32349-7.  Google Scholar

[5]

S. De Marchi and R. Schaback, Stability of kernel-based interpolation, Adv. Comput. Math., 32 (2010), 155-161.  doi: 10.1007/s10444-008-9093-4.  Google Scholar

[6]

D. L. Donoho, For most large underdetermined systems of equations, the minimal $l_1$-norm near-solution approximates the sparsest near-solution, Comm. Pure Appl. Math., 59 (2006), 907-934.  doi: 10.1002/cpa.20131.  Google Scholar

[7]

G. Fasshauer and M. McCourt, Kernel-Based Approximation Methods using MATLAB, Interdisciplinary Mathematical Sciences. Vol. 19, World Scientific, 2015. Google Scholar

[8]

Z.-C. Guo and L. Shi, Learning with coefficient-based regularization and $\ell^1$-penalty, Adv. Comput. Math., 39 (2013), 493-510.  doi: 10.1007/s10444-012-9288-6.  Google Scholar

[9]

T. HangelbroekF. J. Narcowich and J. D. Ward, Kernel approximation on manifolds I: bounding the Lebesgue constant, SIAM J. Math. Anal., 42 (2010), 1732-1760.  doi: 10.1137/090769570.  Google Scholar

[10]

L. Huang, C. Liu, L. Tan and Q. Ye, Generalized representer theorems in Banach spaces, Anal. Appl. (Singap.), (2019). doi: 10.1142/S0219530519410100.  Google Scholar

[11]

V. Klee, On a theorem of Dubins, J. Math. Anal. Appl., 7 (1963), 425-427.  doi: 10.1016/0022-247X(63)90063-5.  Google Scholar

[12]

Z. Li, Y. Xu and Q. Ye, Sparse support vector machines in reproducing kernel banach spaces, Contemporary Computational Mathematics-A Celebration of the 80th Birthday of Ian Sloan, Springer, Cham, 1 (2018), 869–887.  Google Scholar

[13]

R. Lin, G. Song and H. Zhang, Multi-task learning in vector-valued reproducing kernel Banach spaces with the $\ell^1$ norm, https://arXiv.org/abs/1901.01036. Google Scholar

[14]

R. Lin, H. Zhang and J. Zhang, On reproducing kernel Banach spaces: Generic definitions and unified framework of constructions, https://arXiv.org/abs/1901.01002. Google Scholar

[15]

A. Rudi, R. Camoriano and L. Rosasco, Less is more: Nyström computational regularization, in Advances in Neural Information Processing Systems, (2015), 1657–1665. Google Scholar

[16]

K. Schlegel, When is there a representer theorem? Nondifferentiable regularisers and Banach spaces, J. Global Optim., 74 (2019), 401-415.  doi: 10.1007/s10898-019-00767-0.  Google Scholar

[17] B. Schölkopf and A. J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, The MIT Press, Cambridge, 2001.   Google Scholar
[18] B. Simon, Convexity: An Analytic Viewpoint, vol. 187 of Cambridge Tracts in Mathematics, Cambridge University Press, Cambridge, 2011.  doi: 10.1017/CBO9780511910135.  Google Scholar
[19]

G. Song and H. Zhang, Reproducing kernel Banach spaces with the $\ell^1$ norm ii: Error analysis for regularized least square regression, Neural Comput., 23 (2011), 2713-2729.  doi: 10.1162/NECO_a_00178.  Google Scholar

[20]

G. SongH. Zhang and F. J. Hickernell, Reproducing kernel Banach spaces with the $\ell^1$ norm, Appl. Comput. Harmon. Anal., 34 (2013), 96-116.  doi: 10.1016/j.acha.2012.03.009.  Google Scholar

[21]

B. K. SriperumbudurK. Fukumizu and G. R. G. Lanckriet, Universality, characteristic kernels and RKHS embedding of measures, J. Mach. Learn. Res., 12 (2011), 2389-2410.   Google Scholar

[22]

I. Steinwart and A. Christmann, Support Vector Machines, Information Science and Statistics, Springer, New York, 2008.  Google Scholar

[23]

M. UnserJ. Fageot and H. Gupta, Representer theorems for sparsity-promoting $\ell_1$ regularization, IEEE Trans. Inform. Theory, 62 (2016), 5167-5180.  doi: 10.1109/TIT.2016.2590421.  Google Scholar

[24]

Y. Xu and Q. Ye, Generalized Mercer kernels and reproducing kernel Banach spaces, Mem. Amer. Math. Soc., 258 (2019), no. 1243,122 pp. doi: 10.1090/memo/1243.  Google Scholar

[25]

H. ZhangY. Xu and J. Zhang, Reproducing kernel Banach spaces for machine learning, J. Mach. Learn. Res., 10 (2009), 2741-2775.  doi: 10.1109/IJCNN.2009.5179093.  Google Scholar

[26]

H. Zhang and L. Zhao, On the inclusion relation of reproducing kernel Hilbert spaces, Anal. Appl. (Singap.), 11 (2013), 1350014, 31pp. doi: 10.1142/S0219530513500140.  Google Scholar

show all references

References:
[1]

S. BoydN. ParikhE. ChuB. Peleato and J. Eckstein, Distributed optimization and statistical learning via the alternating direction method of multipliers, Found. Trends Mach. Learn., 3 (2011), 1-122.  doi: 10.1561/2200000016.  Google Scholar

[2]

C. BoyerA. ChambolleY. De CastroV. DuvalF. de Gournay and P. Weiss, On representer theorems and convex regularization, SIAM J. Optim., 29 (2019), 1260-1281.  doi: 10.1137/18M1200750.  Google Scholar

[3]

O. Christensen, An Introduction to Frames and Riesz Bases, 2nd edition, Applied and Numerical Harmonic Analysis, Birkhäuser/Springer, [Cham], 2016. doi: 10.1007/978-3-319-25613-9.  Google Scholar

[4]

H. G. Dales, F. K. Dashiell Jr., A. T.-M. Lau and D. Strauss, Banach Spaces of Continuous Functions as Dual Spaces, CMS Books in Mathematics/Ouvrages de Mathématiques de la SMC, Springer, Cham, 2016. doi: 10.1007/978-3-319-32349-7.  Google Scholar

[5]

S. De Marchi and R. Schaback, Stability of kernel-based interpolation, Adv. Comput. Math., 32 (2010), 155-161.  doi: 10.1007/s10444-008-9093-4.  Google Scholar

[6]

D. L. Donoho, For most large underdetermined systems of equations, the minimal $l_1$-norm near-solution approximates the sparsest near-solution, Comm. Pure Appl. Math., 59 (2006), 907-934.  doi: 10.1002/cpa.20131.  Google Scholar

[7]

G. Fasshauer and M. McCourt, Kernel-Based Approximation Methods using MATLAB, Interdisciplinary Mathematical Sciences. Vol. 19, World Scientific, 2015. Google Scholar

[8]

Z.-C. Guo and L. Shi, Learning with coefficient-based regularization and $\ell^1$-penalty, Adv. Comput. Math., 39 (2013), 493-510.  doi: 10.1007/s10444-012-9288-6.  Google Scholar

[9]

T. HangelbroekF. J. Narcowich and J. D. Ward, Kernel approximation on manifolds I: bounding the Lebesgue constant, SIAM J. Math. Anal., 42 (2010), 1732-1760.  doi: 10.1137/090769570.  Google Scholar

[10]

L. Huang, C. Liu, L. Tan and Q. Ye, Generalized representer theorems in Banach spaces, Anal. Appl. (Singap.), (2019). doi: 10.1142/S0219530519410100.  Google Scholar

[11]

V. Klee, On a theorem of Dubins, J. Math. Anal. Appl., 7 (1963), 425-427.  doi: 10.1016/0022-247X(63)90063-5.  Google Scholar

[12]

Z. Li, Y. Xu and Q. Ye, Sparse support vector machines in reproducing kernel banach spaces, Contemporary Computational Mathematics-A Celebration of the 80th Birthday of Ian Sloan, Springer, Cham, 1 (2018), 869–887.  Google Scholar

[13]

R. Lin, G. Song and H. Zhang, Multi-task learning in vector-valued reproducing kernel Banach spaces with the $\ell^1$ norm, https://arXiv.org/abs/1901.01036. Google Scholar

[14]

R. Lin, H. Zhang and J. Zhang, On reproducing kernel Banach spaces: Generic definitions and unified framework of constructions, https://arXiv.org/abs/1901.01002. Google Scholar

[15]

A. Rudi, R. Camoriano and L. Rosasco, Less is more: Nyström computational regularization, in Advances in Neural Information Processing Systems, (2015), 1657–1665. Google Scholar

[16]

K. Schlegel, When is there a representer theorem? Nondifferentiable regularisers and Banach spaces, J. Global Optim., 74 (2019), 401-415.  doi: 10.1007/s10898-019-00767-0.  Google Scholar

[17] B. Schölkopf and A. J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, The MIT Press, Cambridge, 2001.   Google Scholar
[18] B. Simon, Convexity: An Analytic Viewpoint, vol. 187 of Cambridge Tracts in Mathematics, Cambridge University Press, Cambridge, 2011.  doi: 10.1017/CBO9780511910135.  Google Scholar
[19]

G. Song and H. Zhang, Reproducing kernel Banach spaces with the $\ell^1$ norm ii: Error analysis for regularized least square regression, Neural Comput., 23 (2011), 2713-2729.  doi: 10.1162/NECO_a_00178.  Google Scholar

[20]

G. SongH. Zhang and F. J. Hickernell, Reproducing kernel Banach spaces with the $\ell^1$ norm, Appl. Comput. Harmon. Anal., 34 (2013), 96-116.  doi: 10.1016/j.acha.2012.03.009.  Google Scholar

[21]

B. K. SriperumbudurK. Fukumizu and G. R. G. Lanckriet, Universality, characteristic kernels and RKHS embedding of measures, J. Mach. Learn. Res., 12 (2011), 2389-2410.   Google Scholar

[22]

I. Steinwart and A. Christmann, Support Vector Machines, Information Science and Statistics, Springer, New York, 2008.  Google Scholar

[23]

M. UnserJ. Fageot and H. Gupta, Representer theorems for sparsity-promoting $\ell_1$ regularization, IEEE Trans. Inform. Theory, 62 (2016), 5167-5180.  doi: 10.1109/TIT.2016.2590421.  Google Scholar

[24]

Y. Xu and Q. Ye, Generalized Mercer kernels and reproducing kernel Banach spaces, Mem. Amer. Math. Soc., 258 (2019), no. 1243,122 pp. doi: 10.1090/memo/1243.  Google Scholar

[25]

H. ZhangY. Xu and J. Zhang, Reproducing kernel Banach spaces for machine learning, J. Mach. Learn. Res., 10 (2009), 2741-2775.  doi: 10.1109/IJCNN.2009.5179093.  Google Scholar

[26]

H. Zhang and L. Zhao, On the inclusion relation of reproducing kernel Hilbert spaces, Anal. Appl. (Singap.), 11 (2013), 1350014, 31pp. doi: 10.1142/S0219530513500140.  Google Scholar

Figure 1.  Numerical results of models (12) and (13) for Tai Chi data set are illustrated in Figure 1(a) and Figure 1(b), respectively.
Figure 2.  Numerical results of models (12) and (13) for the second data set are illustrated in Figure 2(a) and Figure 2(b), respectively
Table 1.  Lebesgue constants of the Laplacian kernel $ e^{-\|x-x'\|_2} $ on a set of $ n $ grid points of $ [-1, 1]^2 $
n=100 n=400 n=900 n=1600 n=2500
1.237204 1.244770 1.249653 1.246516 1.246808
n=100 n=400 n=900 n=1600 n=2500
1.237204 1.244770 1.249653 1.246516 1.246808
Table 2.  Lebesgue constants of the Laplacian kernel $ e^{-\|x-x'\|_2} $, $ x, x'\in{\mathbb R}^3 $ on a set of $ n $ grid points of $ [-1, 1]^3 $
n=125 n=1000 n=1728 n=2197 n=3375
1.624012 1.705158 1.709775 1.711243 1.712867
n=125 n=1000 n=1728 n=2197 n=3375
1.624012 1.705158 1.709775 1.711243 1.712867
[1]

Federico Rodriguez Hertz, Zhiren Wang. On $ \epsilon $-escaping trajectories in homogeneous spaces. Discrete & Continuous Dynamical Systems - A, 2021, 41 (1) : 329-357. doi: 10.3934/dcds.2020365

[2]

Yichen Zhang, Meiqiang Feng. A coupled $ p $-Laplacian elliptic system: Existence, uniqueness and asymptotic behavior. Electronic Research Archive, 2020, 28 (4) : 1419-1438. doi: 10.3934/era.2020075

[3]

Lihong Zhang, Wenwen Hou, Bashir Ahmad, Guotao Wang. Radial symmetry for logarithmic Choquard equation involving a generalized tempered fractional $ p $-Laplacian. Discrete & Continuous Dynamical Systems - S, 2020  doi: 10.3934/dcdss.2020445

[4]

Mokhtar Bouloudene, Manar A. Alqudah, Fahd Jarad, Yassine Adjabi, Thabet Abdeljawad. Nonlinear singular $ p $ -Laplacian boundary value problems in the frame of conformable derivative. Discrete & Continuous Dynamical Systems - S, 2020  doi: 10.3934/dcdss.2020442

[5]

Mathew Gluck. Classification of solutions to a system of $ n^{\rm th} $ order equations on $ \mathbb R^n $. Communications on Pure & Applied Analysis, 2020, 19 (12) : 5413-5436. doi: 10.3934/cpaa.2020246

[6]

Lei Liu, Li Wu. Multiplicity of closed characteristics on $ P $-symmetric compact convex hypersurfaces in $ \mathbb{R}^{2n} $. Discrete & Continuous Dynamical Systems - A, 2020  doi: 10.3934/dcds.2020378

[7]

Aihua Fan, Jörg Schmeling, Weixiao Shen. $ L^\infty $-estimation of generalized Thue-Morse trigonometric polynomials and ergodic maximization. Discrete & Continuous Dynamical Systems - A, 2021, 41 (1) : 297-327. doi: 10.3934/dcds.2020363

[8]

Denis Bonheure, Silvia Cingolani, Simone Secchi. Concentration phenomena for the Schrödinger-Poisson system in $ \mathbb{R}^2 $. Discrete & Continuous Dynamical Systems - S, 2020  doi: 10.3934/dcdss.2020447

[9]

Thabet Abdeljawad, Mohammad Esmael Samei. Applying quantum calculus for the existence of solution of $ q $-integro-differential equations with three criteria. Discrete & Continuous Dynamical Systems - S, 2020  doi: 10.3934/dcdss.2020440

[10]

Wenjun Liu, Yukun Xiao, Xiaoqing Yue. Classification of finite irreducible conformal modules over Lie conformal algebra $ \mathcal{W}(a, b, r) $. Electronic Research Archive, , () : -. doi: 10.3934/era.2020123

[11]

Zuliang Lu, Fei Huang, Xiankui Wu, Lin Li, Shang Liu. Convergence and quasi-optimality of $ L^2- $norms based an adaptive finite element method for nonlinear optimal control problems. Electronic Research Archive, 2020, 28 (4) : 1459-1486. doi: 10.3934/era.2020077

[12]

Sumit Arora, Manil T. Mohan, Jaydev Dabas. Approximate controllability of a Sobolev type impulsive functional evolution system in Banach spaces. Mathematical Control & Related Fields, 2020  doi: 10.3934/mcrf.2020049

[13]

Soniya Singh, Sumit Arora, Manil T. Mohan, Jaydev Dabas. Approximate controllability of second order impulsive systems with state-dependent delay in Banach spaces. Evolution Equations & Control Theory, 2020  doi: 10.3934/eect.2020103

[14]

Monia Capanna, Jean C. Nakasato, Marcone C. Pereira, Julio D. Rossi. Homogenization for nonlocal problems with smooth kernels. Discrete & Continuous Dynamical Systems - A, 2020  doi: 10.3934/dcds.2020385

[15]

Xuemei Chen, Julia Dobrosotskaya. Inpainting via sparse recovery with directional constraints. Mathematical Foundations of Computing, 2020, 3 (4) : 229-247. doi: 10.3934/mfc.2020025

[16]

Xin Guo, Lexin Li, Qiang Wu. Modeling interactive components by coordinate kernel polynomial models. Mathematical Foundations of Computing, 2020, 3 (4) : 263-277. doi: 10.3934/mfc.2020010

[17]

Huiying Fan, Tao Ma. Parabolic equations involving Laguerre operators and weighted mixed-norm estimates. Communications on Pure & Applied Analysis, 2020, 19 (12) : 5487-5508. doi: 10.3934/cpaa.2020249

[18]

Bahaaeldin Abdalla, Thabet Abdeljawad. Oscillation criteria for kernel function dependent fractional dynamic equations. Discrete & Continuous Dynamical Systems - S, 2020  doi: 10.3934/dcdss.2020443

[19]

Wenmeng Geng, Kai Tao. Large deviation theorems for dirichlet determinants of analytic quasi-periodic jacobi operators with Brjuno-Rüssmann frequency. Communications on Pure & Applied Analysis, 2020, 19 (12) : 5305-5335. doi: 10.3934/cpaa.2020240

[20]

Justin Holmer, Chang Liu. Blow-up for the 1D nonlinear Schrödinger equation with point nonlinearity II: Supercritical blow-up profiles. Communications on Pure & Applied Analysis, , () : -. doi: 10.3934/cpaa.2020264

 Impact Factor: 

Metrics

  • PDF downloads (88)
  • HTML views (241)
  • Cited by (0)

Other articles
by authors

[Back to Top]