# American Institute of Mathematical Sciences

eISSN: 2639-8001

## Foundations of Data Science

2020, Volume 2, Issue 3

2020, 2(3): 207-255 doi: 10.3934/fods.2020011
Abstract:

In this paper we are concerned with the learnability of energies from data obtained by observing time evolutions of their critical points starting at random initial equilibria. As a byproduct of our theoretical framework we introduce the novel concept of a mean-field limit of critical point evolutions and of their energy balance as a new form of transport. We formulate the energy learning as a variational problem, minimizing the discrepancy of energy competitors from fulfilling the equilibrium condition along any trajectory of critical points originating at random initial equilibria. By $\Gamma$-convergence arguments we prove the convergence of minimal solutions obtained from a finite number of observations to the exact energy in a suitable sense. The abstract framework is actually fully constructive and numerically implementable. Hence, the approximation of the energy from a finite number of observations of past evolutions allows one to simulate further evolutions, which are fully data-driven. As we aim at a precise quantitative analysis, and to provide concrete examples of tractable solutions, we present analytic and numerical results on the reconstruction of an elastic energy for a one-dimensional model of a thin nonlinear-elastic rod.

2020, 2(3): 257-278 doi: 10.3934/fods.2020012
Abstract:

Recently, neural networks (NN) with an infinite number of layers have been introduced. Especially for these very large NN, the training procedure is very expensive. Hence, there is interest in studying their robustness with respect to input data, to avoid unnecessarily retraining the network.

Typically, model-based statistical inference methods, e.g. Bayesian neural networks, are used to quantify uncertainties. Here, we consider a special class of residual neural networks and study the case where the number of layers can be arbitrarily large. Kinetic theory then allows one to interpret the network as a dynamical system described by a partial differential equation. We study the robustness of the mean-field neural network with respect to perturbations in the initial data by applying uncertainty quantification (UQ) approaches to the loss functions.
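
For orientation, the passage from a residual network to a partial differential equation can be sketched as follows. This is an illustrative formulation only, not taken from the abstract: the activation $\sigma$, the weights $w$, the biases $b$, the step size $\Delta t = T/L$, and the density $f$ are assumed notation, and the article's own model may differ in its details.

```latex
\begin{align*}
  % Residual update over L layers (assumed form)
  x^{\ell+1} &= x^{\ell} + \Delta t\, \sigma\!\left(w^{\ell} x^{\ell} + b^{\ell}\right),
    \qquad \ell = 0, \dots, L-1, \\
  % Formal limit L \to \infty with \Delta t = T/L: an ODE in the "layer time" t
  \dot{x}(t) &= \sigma\!\left(w(t)\, x(t) + b(t)\right), \qquad x(0) = x_0, \\
  % Transport (continuity) equation for the density f(t, x) of the states
  \partial_t f(t,x) &+ \nabla_x \cdot \Big( \sigma\!\left(w(t)\, x + b(t)\right) f(t,x) \Big) = 0,
    \qquad f(0,x) = f_0(x).
\end{align*}
```

In this reading, a perturbation of the input data is a perturbation of the initial density $f_0$, which is then propagated by the transport equation.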

2020, 2(3): 279-307 doi: 10.3934/fods.2020013
Abstract:

This paper presents novel mathematical results in support of the probabilistic learning on manifolds (PLoM) recently introduced by the authors. An initial dataset, consisting of a small number of points in a Euclidean space, is given. The points are independent realizations of a vector-valued random variable whose non-Gaussian probability measure is unknown but is, a priori, concentrated in an unknown subset of the Euclidean space. A learned dataset, consisting of additional realizations, is constructed. The probability measure estimated from the initial dataset is transported through a linear transformation constructed using a reduced-order diffusion-maps basis. It is proven that this transported measure is a marginal distribution of the invariant measure of a reduced-order Itô stochastic differential equation. The concentration of the probability measure is preserved. This property is shown by analyzing a distance between the random matrix constructed with the PLoM and the matrix representing the initial dataset, as a function of the dimension of the basis. It is further proven that this distance has a minimum for a dimension of the reduced-order diffusion-maps basis that is strictly smaller than the number of points in the initial dataset.

2020, 2(3): 309-332 doi: 10.3934/fods.2020014
Abstract:

The supervised learning problem to determine a neural network approximation $\mathbb{R}^d\ni x\mapsto\sum_{k = 1}^K\hat\beta_k e^{\mathrm{i}\omega_k\cdot x}$ with one hidden layer is studied as a random Fourier features algorithm. The Fourier features, i.e., the frequencies $\omega_k\in\mathbb{R}^d$, are sampled using an adaptive Metropolis sampler. The Metropolis test accepts proposal frequencies $\omega_k'$, having corresponding amplitudes $\hat\beta_k'$, with the probability $\min\big\{1, (|\hat\beta_k'|/|\hat\beta_k|)^{\gamma}\big\}$, for a certain positive parameter $\gamma$, determined by minimizing the approximation error for given computational work. This adaptive, non-parametric stochastic method leads asymptotically, as $K\to\infty$, to equidistributed amplitudes $|\hat\beta_k|$, analogous to deterministic adaptive algorithms for differential equations. The equidistributed amplitudes are shown to asymptotically correspond to the optimal density for independent samples in random Fourier features methods. Numerical evidence is provided in order to demonstrate the approximation properties and efficiency of the proposed algorithm. The algorithm is tested both on synthetic data and a real-world high-dimensional benchmark.
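
The Metropolis test described in this abstract can be illustrated with a minimal Python sketch. It covers only the accept/reject rule stated above. How proposals are generated and how the amplitudes are computed are not specified in the abstract, so both are passed in as plain inputs. The function names `acceptance_probability` and `metropolis_test` are hypothetical, as is the convention of always accepting when the current amplitude is zero.

```python
import random


def acceptance_probability(beta_old, beta_new, gamma):
    """Return min{1, (|beta_new| / |beta_old|)^gamma} for a single frequency.

    beta_old and beta_new may be real or complex amplitudes.
    """
    old, new = abs(beta_old), abs(beta_new)
    # Assumption (not from the abstract): accept outright when the proposal's
    # amplitude is at least as large, which also covers a zero current
    # amplitude and avoids overflow in the power below.
    if new >= old:
        return 1.0
    return (new / old) ** gamma


def metropolis_test(omegas, betas, proposals, proposal_betas, gamma, rng=random):
    """One accept/reject sweep over all frequencies.

    omegas, proposals: current and proposed frequencies (any objects).
    betas, proposal_betas: the corresponding amplitudes, supplied by the caller.
    Returns the list of frequencies kept after the sweep.
    """
    kept = []
    for w, b, w_new, b_new in zip(omegas, betas, proposals, proposal_betas):
        if rng.random() < acceptance_probability(b, b_new, gamma):
            kept.append(w_new)  # proposal accepted
        else:
            kept.append(w)      # proposal rejected, keep current frequency
    return kept
```

A proposal whose amplitude is half the current one is accepted with probability $0.5^{\gamma}$, so a larger $\gamma$ makes the sampler more selective toward frequencies with large amplitudes.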

2020, 2(3): 333-349 doi: 10.3934/fods.2020016
Abstract:

We consider a combined state and drift estimation problem for the linear stochastic heat equation. The infinite-dimensional Bayesian inference problem is formulated in terms of the Kalman–Bucy filter over an extended state space, and its long-time asymptotic properties are studied. Asymptotic posterior contraction rates in the unknown drift function are the main contribution of this paper. Such rates have been studied before for stationary non-parametric Bayesian inverse problems, and here we demonstrate the consistency of our time-dependent formulation with these previous results, building upon scale separation and a slow manifold approximation.