Foundations of Data Science

March 2020, Volume 2, Issue 1


Stochastic gradient descent algorithm for stochastic optimization in solving analytic continuation problems
Feng Bao and Thomas Maier
2020, 2(1): 1-17 doi: 10.3934/fods.2020001

We propose a stochastic gradient descent based optimization algorithm to solve the analytic continuation problem, in which we extract real frequency spectra from imaginary time Quantum Monte Carlo data. Analytic continuation is an ill-posed inverse problem that is usually solved by regularized optimization methods, such as the Maximum Entropy method, or by stochastic optimization methods. The main contribution of this work is to improve the performance of stochastic optimization approaches by introducing a supervised stochastic gradient descent algorithm to solve a flipped inverse system which processes the random solutions obtained by a type of Fast and Efficient Stochastic Optimization Method.

Semi-supervised classification on graphs using explicit diffusion dynamics
Robert L. Peach, Alexis Arnaudon and Mauricio Barahona
2020, 2(1): 19-33 doi: 10.3934/fods.2020002

Classification tasks based on feature vectors can be significantly improved by including within deep learning a graph that summarises pairwise relationships between the samples. Intuitively, the graph acts as a conduit to channel and bias the inference of class labels. Here, we study classification methods that consider the graph as the originator of an explicit graph diffusion. We show that appending graph diffusion to feature-based learning as an a posteriori refinement achieves state-of-the-art classification accuracy. This method, which we call Graph Diffusion Reclassification (GDR), uses overshooting events of a diffusive graph dynamics to reclassify individual nodes. The method uses intrinsic measures of node influence, which are distinct for each node, and allows the evaluation of the relationship and importance of features and graph for classification. We also present diff-GCN, a simple extension of Graph Convolutional Neural Network (GCN) architectures that leverages explicit diffusion dynamics, and allows the natural use of directed graphs. To showcase our methods, we use benchmark datasets of documents with associated citation data.

Bayesian inference for latent chain graphs
Deng Lu, Maria De Iorio, Ajay Jasra and Gary L. Rosner
2020, 2(1): 35-54 doi: 10.3934/fods.2020003

In this article we consider Bayesian inference for partially observed Andersson-Madigan-Perlman (AMP) Gaussian chain graph (CG) models. Such models are of particular interest in applications such as biological networks and financial time series. The model itself features a variety of constraints which make both prior modeling and computational inference challenging. We develop a framework that addresses these challenges, using a sequential Monte Carlo (SMC) method for statistical inference. Our approach is illustrated on both simulated data and real case studies drawn from university graduation rates and a pharmacokinetics study.

Bayesian inference of chaotic dynamics by merging data assimilation, machine learning and expectation-maximization
Marc Bocquet, Julien Brajard, Alberto Carrassi and Laurent Bertino
2020, 2(1): 55-80 doi: 10.3934/fods.2020004

The reconstruction from observations of high-dimensional chaotic dynamics such as geophysical flows is hampered by (ⅰ) the partial and noisy observations that can realistically be obtained, (ⅱ) the need to learn from long time series of data, and (ⅲ) the unstable nature of the dynamics. To achieve such inference from observations over long time series, it has been suggested to combine data assimilation and machine learning in several ways. We show how to unify these approaches from a Bayesian perspective using expectation-maximization and coordinate descents. In doing so, the model, the state trajectory and the model error statistics are all estimated together. Implementations and approximations of these methods are discussed. Finally, we successfully test the approach numerically on two relevant low-order chaotic models with distinct identifiability.

Corrigendum to "Cluster, classify, regress: A general method for learning discontinuous functions [1]"
David E. Bernholdt, Mark R. Cianciosa, Clement Etienam, David L. Green, Kody J. H. Law and Jin M. Park
2020, 2(1): 81-81 doi: 10.3934/fods.2020005

We, as authors of paper [1], wish to correct the order of all authors to alphabetical order according to the authors' last names.



