Communications on Pure and Applied Analysis
August 2020 , Volume 19 , Issue 8
This paper is concerned with learning rates for partial linear functional models (PLFM) within reproducing kernel Hilbert spaces (RKHS), where the covariates consist of two parts: functional-type covariates and scalar ones. In contrast to the frequently used functional principal component analysis for functional models, the finite number of basis functions in the proposed approach can be generated automatically by taking advantage of the reproducing property of the RKHS. This avoids the additional computational cost of PCA decomposition and the need to choose the number of principal components. Moreover, the coefficient estimators with bounded covariates converge to the true coefficients with linear rates, as if the functional term in PLFM had no effect on the linear part. In contrast, the prediction error for the functional estimator is significantly affected by the ambient dimension of the scalar covariates. Finally, we develop a numerical algorithm for the proposed penalized approach, and simulated experiments are carried out to support our theoretical results.
Canonical correlation analysis (CCA) is a powerful statistical tool for detecting mutual information between two sets of multi-dimensional random variables. Unlike CCA, generalized CCA (GCCA), a natural extension of CCA, can detect the relations among multiple (more than two) datasets. To interpret canonical variates more efficiently, this paper presents a novel sparse GCCA algorithm via the linearized Bregman method, which is a generalization of traditional sparse CCA methods. Experimental results on both synthetic and real datasets demonstrate the effectiveness and efficiency of the proposed algorithm when compared with several state-of-the-art sparse CCA and deep CCA algorithms.
The huge amount of available data nowadays is a challenge for kernel-based machine learning algorithms like SVMs with respect to runtime and storage capacities. Local approaches might help to relieve these issues and to improve statistical accuracy. It has already been shown that these local approaches are consistent and robust in a basic sense. This article refines the analysis of robustness properties towards the so-called influence function which expresses the differentiability of the learning method: We show that there is a differentiable dependency of our locally learned predictor on the underlying distribution. The assumptions of the proven theorems can be verified without knowing anything about this distribution. This makes the results interesting also from an applied point of view.
We study the efficiency of the approximation of the functions from the Besov space
We consider learning rates of kernel regularized regression (KRR) based on reproducing kernel Hilbert spaces (RKHSs) and differentiable strongly convex losses and provide some new strongly convex losses. We first show the robustness with the maximum mean discrepancy (MMD) and the Hutchinson metric respectively, and, along this line, bound the learning rate of the KRR. We first provide a capacity dependent learning rate and then give the learning rates for four concrete strongly convex losses respectively. In particular, we provide the learning rates when the hypothesis RKHS's logarithmic complexity exponent is arbitrarily small as well as sufficiently large.
The perfect achievements have been made for
Recent investigations on the error analysis of kernel regularized pairwise learning initiate the theoretical research on pairwise reproducing kernel Hilbert spaces (PRKHSs). In the present paper, we provide a method of constructing PRKHSs with classical Jacobi orthogonal polynomials. The performance of the kernel regularized online pairwise regression learning algorithms based on a quadratic loss function is investigated. Applying convex analysis and Rademacher complexity techniques, the bounds for the generalization error are provided explicitly. It is shown that the convergence rate can be greatly improved by adjusting the scale parameters in the loss function.
This paper considers recovery of signals that are corrupted with noise. We focus on a novel model which is called relaxed ALASSO (RALASSO) model introduced by Z. Tan et al. (2014). Compared to the well-known ALASSO, RALASSO can be solved better in practice. Z. Tan et al. (2014) used the
High-dimensional binary classification has been intensively studied in the machine learning community in the last few decades. The support vector machine (SVM), one of the most popular classifiers, depends on only a portion of the training samples, called support vectors, which leads to suboptimal performance in the setting of high dimension and low sample size (HDLSS). Large-margin unified machines (LUMs) are a family of margin-based classifiers proposed to solve the so-called "data piling" problem, which is inherent in SVM under HDLSS settings. In this paper we study the binary classification algorithms associated with LUM loss functions in the framework of reproducing kernel Hilbert spaces. A quantitative convergence analysis is carried out for these algorithms by means of a novel application of projection operators to overcome the technical difficulty. The rates are explicitly derived under a priori conditions on the approximation and capacity of the reproducing kernel Hilbert space.
We show that deep networks are better than shallow networks at approximating functions that can be expressed as a composition of functions described by a directed acyclic graph, because the deep networks can be designed to have the same compositional structure, while a shallow network cannot exploit this knowledge. Thus, the blessing of compositionality mitigates the curse of dimensionality. On the other hand, a theorem called good propagation of errors allows us to "lift" theorems about shallow networks to those about deep networks with an appropriate choice of norms, smoothness, etc. We illustrate this in three contexts where each channel in the deep network calculates a spherical polynomial, a non-smooth ReLU network, or another zonal function network closely related to the ReLU network.
Inverses of certain positive linear operators have been investigated in several recent papers, in connection with problems such as the decomposition of classical operators, the representation of Lagrange-type operators, and asymptotic formulas of Voronovskaja type. Motivated by this research, in this paper we give some representations for the inverses of certain positive linear operators, such as the Bernstein, Beta, Bernstein-Durrmeyer, genuine Bernstein-Durrmeyer and Kantorovich operators. Moreover, some Voronovskaja type formulas for the inverses of these operators are obtained. Several techniques are used in order to get such results.
In this paper, we consider the nonlinear ill-posed inverse problem with noisy data in the statistical learning setting. The Tikhonov regularization scheme in Hilbert scales is considered to reconstruct the estimator from the random noisy data. In this statistical learning setting, we derive the rates of convergence for the regularized solution under certain assumptions on the nonlinear forward operator and the prior assumptions. We discuss estimates of the reconstruction error using the approach of reproducing kernel Hilbert spaces.
The aim of this paper is twofold. Firstly, we derive an explicit expression of the (theoretical) solutions of stochastic differential equations with affine coefficients driven by
The use of sampling methods in computing eigenpairs of two-parameter boundary value problems is extremely rare. As far as we know, there are only two studies up to now using the bivariate version of the classical and regularized sampling series. These series have a slow convergence rate. In this paper, we use the bivariate sinc-Gauss sampling formula that was proposed in [
In this paper, we study the convergence of the gradient descent method for the maximum correntropy criterion (MCC) associated with reproducing kernel Hilbert spaces (RKHSs). MCC is widely used in many real-world applications because of its robustness and ability to deal with non-Gaussian impulse noises. In the regression context, we show that the gradient descent iterates of MCC can approximate the target function and derive the capacity-dependent convergence rate by taking a suitable iteration number. Our result nearly matches the optimal convergence rate stated in previous work, and it shows that the scaling parameter is crucial to MCC's approximation ability and robustness property. The novelty of our work lies in a sharp estimate for the norms of the gradient descent iterates and the projection operation on the last iterate.
Negative binomial regression has been widely applied in various research settings to account for counts with overdispersion. Yet, when the gamma scale parameter,
Recently, there has been considerable work on developing efficient stochastic optimization algorithms for AUC maximization. However, most of it focuses on the least square loss, which may not be the best option in practice. The main difficulty in dealing with a general convex loss is the pairwise nonlinearity w.r.t. the sampling distribution generating the data. In this paper, we use Bernstein polynomials to uniformly approximate the general losses, which makes it possible to decouple the pairwise nonlinearity. In particular, we show that this reduction for AUC maximization with a general loss is equivalent to a weakly convex (nonconvex) min-max formulation. Then, we develop a novel SGD algorithm for AUC maximization with per-iteration cost linear in the data dimension, making it amenable to streaming data analysis. Despite its non-convexity, we prove its global convergence by exploiting the appealing convexity-preserving property of Bernstein polynomials and the intrinsic structure of the min-max formulation. Experiments are performed to validate the effectiveness of the proposed approach.
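The Bernstein polynomial approximation mentioned in this abstract can be sketched in a few lines. This is only the classical operator on [0, 1], not the paper's reduction or its SGD algorithm; the quadratic `loss` below is a hypothetical stand-in for a general convex loss, and the function name is an assumption made for the example.

```python
import math


def bernstein(f, n, x):
    """Degree-n Bernstein polynomial of f on [0, 1], evaluated at x:
    sum over k of f(k / n) * C(n, k) * x^k * (1 - x)^(n - k)."""
    return sum(
        f(k / n) * math.comb(n, k) * x**k * (1 - x) ** (n - k)
        for k in range(n + 1)
    )


def loss(t):
    # Hypothetical convex loss on [0, 1], used only for illustration.
    return (1.0 - t) ** 2


# The approximation error at a fixed point shrinks as the degree grows.
for n in (5, 50, 500):
    print(n, abs(bernstein(loss, n, 0.3) - loss(0.3)))
```

For a quadratic the error is known in closed form, x(1 - x)/n, so the printed gaps decay like 1/n. The approximant of a convex function is again convex and lies on or above it, which is the convexity-preserving property the abstract refers to.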
In a recent paper, for univariate max-product sampling operators based on general kernels with bounded generalized absolute moments, we have obtained several