Foundations of Data Science
March 2020, Volume 2, Issue 1
We propose a stochastic gradient descent (SGD) based optimization algorithm for the analytic continuation problem, in which real-frequency spectra are extracted from imaginary-time quantum Monte Carlo data. Analytic continuation is an ill-posed inverse problem that is usually solved by regularized optimization methods, such as the maximum entropy method, or by stochastic optimization methods. The main contribution of this work is to improve the performance of stochastic optimization approaches by introducing a supervised stochastic gradient descent algorithm that solves a flipped inverse system, processing the random solutions obtained by a Fast and Efficient Stochastic Optimization Method.
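The general setting can be illustrated on a toy problem. The sketch below is an assumption-laden stand-in, not the paper's method: it applies plain SGD with Tikhonov regularization and a non-negativity projection to a synthetic ill-conditioned kernel (the kernel, grid sizes, and step size are all invented for illustration).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy ill-posed inverse problem: recover a spectrum A from data G = K @ A + noise.
# The smooth exponential kernel makes K severely ill-conditioned, loosely
# mimicking the analytic continuation kernel (illustrative choice only).
n_tau, n_omega = 40, 60
tau = np.linspace(0.0, 1.0, n_tau)
omega = np.linspace(0.0, 10.0, n_omega)
K = np.exp(-np.outer(tau, omega))
A_true = np.exp(-0.5 * (omega - 4.0) ** 2)   # single Gaussian peak as the "spectrum"
G = K @ A_true + 1e-4 * rng.standard_normal(n_tau)

# SGD on the Tikhonov-regularized least-squares objective
#   J(A) = (1/2) * sum_i (K_i . A - G_i)^2 + (lam/2) * ||A||^2
A = np.zeros(n_omega)
lam, lr = 1e-4, 0.01
for epoch in range(3000):
    for i in rng.permutation(n_tau):          # one randomly ordered pass per epoch
        resid = K[i] @ A - G[i]
        A -= lr * (resid * K[i] + lam * A / n_tau)
    A = np.maximum(A, 0.0)                    # project: spectra are non-negative

rel_misfit = np.linalg.norm(K @ A - G) / np.linalg.norm(G)
```

Even this generic scheme fits the data well; the difficulty the paper addresses is that many spectra fit equally well, which is where the supervised post-processing of stochastic solutions comes in.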
Classification tasks based on feature vectors can be significantly improved by including within deep learning a graph that summarises pairwise relationships between the samples. Intuitively, the graph acts as a conduit to channel and bias the inference of class labels. Here, we study classification methods that consider the graph as the originator of an explicit graph diffusion. We show that appending graph diffusion to feature-based learning as an a posteriori refinement achieves state-of-the-art classification accuracy. This method, which we call Graph Diffusion Reclassification (GDR), uses overshooting events of a diffusive graph dynamics to reclassify individual nodes. The method uses intrinsic measures of node influence, which are distinct for each node, and allows the evaluation of the relationship and importance of features and graph for classification. We also present diff-GCN, a simple extension of Graph Convolutional Neural Network (GCN) architectures that leverages explicit diffusion dynamics, and allows the natural use of directed graphs. To showcase our methods, we use benchmark datasets of documents with associated citation data.
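The core ingredient, diffusion on a graph channeling label information, can be sketched in a few lines. This is a generic heat-kernel label-spreading toy (graph, diffusion time, and seed choice are all hypothetical), not the GDR overshooting criterion or diff-GCN:

```python
import numpy as np

# Tiny undirected graph: two 4-node cliques joined by a single bridge edge,
# i.e. two obvious communities (classes).
A = np.zeros((8, 8))
for block in (range(0, 4), range(4, 8)):
    for i in block:
        for j in block:
            if i != j:
                A[i, j] = 1.0
A[3, 4] = A[4, 3] = 1.0

# Heat kernel exp(-t L) of the combinatorial Laplacian, via eigendecomposition.
L = np.diag(A.sum(axis=1)) - A
w, V = np.linalg.eigh(L)
H = V @ np.diag(np.exp(-0.5 * w)) @ V.T

# Diffuse one-hot seed labels and assign each node the class that deposits
# the most diffused mass on it.
seeds = np.zeros((8, 2))
seeds[0, 0] = 1.0          # class-0 seed in the first clique
seeds[7, 1] = 1.0          # class-1 seed in the second clique
labels = (H @ seeds).argmax(axis=1)
```

Here the diffusion recovers both communities from one seed each; GDR instead uses the transient dynamics (overshooting events) of such a diffusion to refine labels already produced by a feature-based classifier.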
In this article we consider Bayesian inference for partially observed Andersson-Madigan-Perlman (AMP) Gaussian chain graph (CG) models. Such models are of particular interest in applications such as biological networks and financial time series. The model itself features a variety of constraints which make both prior modeling and computational inference challenging. We develop a framework to address these challenges, using a sequential Monte Carlo (SMC) method for statistical inference. Our approach is illustrated on both simulated data and real case studies of university graduation rates and pharmacokinetics.
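To give a flavor of SMC inference, here is a minimal bootstrap particle filter on a scalar linear-Gaussian state-space model. This is the textbook SMC building block under assumed toy dynamics; the paper's constrained AMP chain graph setting is far richer than this sketch:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy state-space model (all parameters assumed for illustration):
#   x_t = 0.9 x_{t-1} + N(0, 0.5^2),   y_t = x_t + N(0, 1)
T, N = 50, 2000
x_true = np.zeros(T)
for t in range(1, T):
    x_true[t] = 0.9 * x_true[t - 1] + 0.5 * rng.standard_normal()
y = x_true + rng.standard_normal(T)

particles = rng.standard_normal(N)
means = np.zeros(T)
for t in range(T):
    if t > 0:                                # propagate through the dynamics
        particles = 0.9 * particles + 0.5 * rng.standard_normal(N)
    logw = -0.5 * (y[t] - particles) ** 2    # Gaussian observation log-likelihood
    w = np.exp(logw - logw.max())
    w /= w.sum()
    means[t] = w @ particles                 # filtering mean estimate
    particles = rng.choice(particles, size=N, p=w)  # multinomial resampling

rmse = np.sqrt(np.mean((means - x_true) ** 2))
```

The filter's state estimates are substantially more accurate than the raw observations, which is the basic payoff that more elaborate SMC schemes (such as the one in the article) build on.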
The reconstruction from observations of high-dimensional chaotic dynamics such as geophysical flows is hampered by (i) the partial and noisy observations that can realistically be obtained, (ii) the need to learn from long time series of data, and (iii) the unstable nature of the dynamics. To achieve such inference from observations over long time series, it has been suggested to combine data assimilation and machine learning in several ways. We show how to unify these approaches from a Bayesian perspective using expectation-maximization and coordinate descent. In doing so, the model, the state trajectory, and the model error statistics are all estimated together. Implementations and approximations of these methods are discussed. Finally, we successfully test the approach numerically on two relevant low-order chaotic models with distinct identifiability properties.
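The coordinate-descent idea of jointly estimating model and trajectory can be sketched on a scalar toy model. This is an illustrative alternating scheme on an assumed AR(1) system with known noise levels, not the paper's full EM framework:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy setup (all parameters assumed): estimate the trajectory x AND the model
# coefficient a of
#   x_{t+1} = a x_t + model error,   y_t = x_t + observation noise,
# by coordinate descent on the joint cost
#   J(a, x) = ||y - x||^2 / r + ||x_{2:T} - a x_{1:T-1}||^2 / q .
T, a_true, q, r = 200, 0.95, 0.05, 0.1
x_true = np.zeros(T)
for t in range(1, T):
    x_true[t] = a_true * x_true[t - 1] + np.sqrt(q) * rng.standard_normal()
y = x_true + np.sqrt(r) * rng.standard_normal(T)

a = 0.5                                       # deliberately poor initial model
for _ in range(20):
    # Trajectory step: J is quadratic in x, so solve dJ/dx = 0 directly.
    D = np.zeros((T - 1, T))
    idx = np.arange(T - 1)
    D[idx, idx] = -a
    D[idx, idx + 1] = 1.0
    M = np.eye(T) / r + D.T @ D / q
    x = np.linalg.solve(M, y / r)
    # Model step: closed-form least-squares fit of x_{t+1} on x_t.
    a = (x[1:] @ x[:-1]) / (x[:-1] @ x[:-1])
```

Alternating the two steps recovers both a trajectory estimate that beats the raw observations and a model coefficient close to the truth, which is the mechanism the unified Bayesian view generalizes to chaotic geophysical models.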
We as authors of paper [