February 2019, 2(1): 1-9. doi: 10.3934/mfc.2019001

Kernel-based online gradient descent using distributed approach

Shantou University, No. 243 Daxue Rd., Shantou, Guangdong, China

* Corresponding author: Xiaming Chen

Published March 2019

Fund Project: The first author is supported by STU Scientific Research Foundation for Talents grant (NTF-18022)

In this paper we study kernel-based online gradient descent with the least squares loss and without an explicit regularization term. Our approach is novel in that it controls the expectation of the $ K $-norm of $ f_t $ through an iterative process. We then apply distributed learning to improve the result.
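The scheme just described can be sketched in code. This is a minimal illustration under stated assumptions, not the paper's exact algorithm or rates: it assumes a Gaussian kernel, a decaying step size eta_t = eta / sqrt(t), and a distributed step that simply averages the local estimators trained on m disjoint chunks of the sample; all function names are illustrative.

```python
import numpy as np

def gaussian_kernel(x, z, sigma=1.0):
    # Gaussian (RBF) kernel; works for scalar or vector inputs.
    d = np.asarray(x, dtype=float) - np.asarray(z, dtype=float)
    return np.exp(-np.dot(d, d) / (2.0 * sigma ** 2))

def kernel_ogd(xs, ys, eta=0.5, kernel=gaussian_kernel):
    """Unregularized kernel online gradient descent with least squares loss.

    Update rule: f_{t+1} = f_t - eta_t * (f_t(x_t) - y_t) * K(x_t, .)
    The estimator is represented by its kernel expansion (centers, coefs).
    """
    centers, coefs = [], []
    for t, (x, y) in enumerate(zip(xs, ys), start=1):
        step = eta / np.sqrt(t)                # assumed schedule eta_t = eta / sqrt(t)
        fx = sum(a * kernel(c, x) for c, a in zip(centers, coefs))
        centers.append(x)
        coefs.append(-step * (fx - y))         # RKHS gradient of (f(x) - y)^2 / 2
    return centers, coefs

def predict(centers, coefs, x, kernel=gaussian_kernel):
    # Evaluate the kernel expansion f(x) = sum_i a_i K(x_i, x).
    return sum(a * kernel(c, x) for c, a in zip(centers, coefs))

def distributed_kernel_ogd(xs, ys, m, eta=0.5):
    """Split the sample into m disjoint chunks, run OGD on each,
    and average the m local estimators (one simple distributed scheme)."""
    chunks = np.array_split(np.arange(len(xs)), m)
    models = [kernel_ogd([xs[i] for i in idx], [ys[i] for i in idx], eta)
              for idx in chunks]
    def f_bar(x):
        return sum(predict(c, a, x) for c, a in models) / m
    return f_bar
```

For example, feeding the learner a constant target y = 1 on random inputs in [0, 1] drives both the single-machine and averaged estimators toward 1 near the data.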

Citation: Xiaming Chen. Kernel-based online gradient descent using distributed approach. Mathematical Foundations of Computing, 2019, 2(1): 1-9. doi: 10.3934/mfc.2019001
References:
[1]

X. Chen and Y. Lei, Refined bounds for online pairwise learning algorithms, Neurocomputing, 275 (2018), 2656-2665. doi: 10.1016/j.neucom.2017.11.049.

[2]

F. Cucker and S. Smale, On the mathematical foundations of learning, Bull. Amer. Math. Soc., 39 (2001), 1-49. doi: 10.1090/S0273-0979-01-00923-5.

[3]

J. Duchi, E. Hazan and Y. Singer, Adaptive subgradient methods for online learning and stochastic optimization, Journal of Machine Learning Research, 12 (2011), 2121-2159.

[4]

T. Hu, Online regression with varying Gaussians and non-identical distributions, Analysis and Applications, 9 (2011), 395-408. doi: 10.1142/S0219530511001923.

[5]

J. Kiefer and J. Wolfowitz, Stochastic estimation of the maximum of a regression function, Ann. Math. Statist., 23 (1952), 462-466. doi: 10.1214/aoms/1177729392.

[6]

S. B. Lin, X. Guo and D. X. Zhou, Distributed learning with regularized least squares, Journal of Machine Learning Research, 18 (2017), Paper No. 92, 31 pp.

[7]

M. Pontil, Y. Ying and D. X. Zhou, Error analysis for online gradient descent algorithms in reproducing kernel Hilbert space, Technical Report, Department of Computer Science, University College London, (2005).

[8]

H. Robbins and S. Monro, A stochastic approximation method, Ann. Math. Statist., 22 (1951), 400-407. doi: 10.1214/aoms/1177729586.

[9]

S. Smale and Y. Yao, Online learning algorithms, Found. Comput. Math., 6 (2006), 145-170. doi: 10.1007/s10208-004-0160-z.

[10]

S. Smale and D. X. Zhou, Online learning with Markov sampling, Analysis and Applications, 7 (2009), 87-113. doi: 10.1142/S0219530509001293.

[11]

S. Vijayakumar, A. D'Souza and S. Schaal, Incremental online learning in high dimensions, Neural Computation, 17 (2005), 2602-2634. doi: 10.1162/089976605774320557.

[12]

C. Wang and T. Hu, Online minimum error entropy algorithm with unbounded sampling, Analysis and Applications, 17 (2019), 293-322. doi: 10.1142/S0219530518500148.

[13]

J. Xu, Z. Zheng, Z. Fan and W. Liu, Online personalized QoS prediction approach for cloud services, 4th International Conference on Cloud Computing and Intelligence Systems, 2016. doi: 10.1109/CCIS.2016.7790220.

[14]

Y. Yao, L. Rosasco and A. Caponnetto, On early stopping in gradient descent learning, Constr. Approx., 26 (2007), 289-315. doi: 10.1007/s00365-006-0663-2.

[15]

Y. Ying and M. Pontil, Online pairwise learning algorithms, Neural Computation, 28 (2016), 743-777. doi: 10.1162/NECO_a_00817.

[16]

Y. Ying and D. X. Zhou, Online regularized classification algorithms, IEEE Trans. Inform. Theory, 52 (2006), 4775-4788. doi: 10.1109/TIT.2006.883632.

[17]

Z. H. Zhou, N. V. Chawla, Y. Jin and G. J. Williams, Big data opportunities and challenges: Discussions from data analytics perspectives, IEEE Computational Intelligence Magazine, 9 (2014), 62-74.

