American Institute of Mathematical Sciences

doi: 10.3934/dcdss.2021084
Online First


Bayesian topological signal processing

Christopher Oballe, Alan Cherne, Dave Boothe, Scott Kerick, Piotr J. Franaszczuk, Vasileios Maroulas

1 University of Notre Dame, Department of Aerospace and Mechanical Engineering, Fitzpatrick Hall of Engineering and Cushing Hall, 112 N Notre Dame Ave, Notre Dame, IN 46556
2 University of Tennessee, Department of Mathematics, 1403 Circle Drive, Knoxville, TN 37996-1320
3 US Army Research Laboratory, 7101 Mulberry Point Road, Bldg. 459, Aberdeen Proving Ground, MD 21005-5425

Received: January 2021. Revised: April 2021. Early access: July 2021.

Topological data analysis encompasses a broad set of techniques that investigate the shape of data. One of the predominant tools in topological data analysis is persistent homology, which is used to create topological summaries of data called persistence diagrams. Persistent homology offers a novel method for signal analysis. Herein, we aid interpretation of the sublevel set persistence diagrams of signals by 1) showing the effect of frequency and instantaneous amplitude on the persistence diagrams for a family of deterministic signals, and 2) providing a general equation for the probability density of persistence diagrams of random signals via a pushforward measure. We also provide a topologically-motivated, efficiently computable statistical descriptor analogous to the power spectral density for signals based on a generalized Bayesian framework for persistence diagrams. This Bayesian descriptor is shown to be competitive with power spectral densities and continuous wavelet transforms at distinguishing signals with different dynamics in a classification problem with autoregressive signals.

Citation: Christopher Oballe, Alan Cherne, Dave Boothe, Scott Kerick, Piotr J. Franaszczuk, Vasileios Maroulas. Bayesian topological signal processing. Discrete & Continuous Dynamical Systems - S, doi: 10.3934/dcdss.2021084
Figures and Tables

Shown above in (a) are the sublevel sets $C_{-0.5}$, $C_{0}$, $C_{0.25}$, and $C_1$ for the damped cosine $e^{-2t}\cos(8\pi t)$; (b) shows the persistence diagram of the sublevel set filtration. The points in (b) are colored to match the connected components to which their birth coordinates correspond. The transition from $C_0$ to $C_{0.25}$ illustrates the Elder rule: the light blue and purple connected components in $C_0$ merge in $C_{0.25}$, and since the purple component has the later birth value, it disappears into the light blue one. A similar merging occurs in the transition from $C_{0.25}$ to $C_{0.5}$, where the light blue component merges into the green one by the same reasoning.
This figure illustrates sources of uncertainty in persistence diagrams. Shown above are signals with additive noise (a) $\mathcal{N}(0,0.01)$ and (b) $\mathcal{N}(0,0.1)$, along with their persistence diagrams. The persistence diagram of the true underlying signal is shown in red. Noise both introduces spurious features and shifts the true features.
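The inflation of diagram cardinality by noise can be checked directly: every local minimum of a sampled signal contributes one point to its sublevel-set persistence diagram, so counting minima before and after adding noise shows how spurious features appear. A minimal sketch, with the noise variance $0.1$ matching panel (b):

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 500)
clean = np.exp(-2 * t) * np.cos(8 * np.pi * t)
noisy = clean + rng.normal(0.0, np.sqrt(0.1), t.size)  # N(0, 0.1) noise

def n_local_minima(y):
    # each interior local minimum births one sublevel-set diagram point
    return int(((y[1:-1] < y[:-2]) & (y[1:-1] <= y[2:])).sum())

# the clean damped cosine has 4 minima; the noisy copy has far more,
# each extra one a spurious low-persistence feature
print(n_local_minima(clean), n_local_minima(noisy))
```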
Top: We consider three signals. The blue signal (Signal 1) and the red signal (Signal 2) are modeled by $a_{\beta}(t)\cos(8\pi t)$, where $a_{\beta}(t) = 5e^{-\beta t}$ with $\beta = 1$ and $\beta = 4$ for Signals 1 and 2, respectively. The green signal (Signal 3) is added to each, and the amplitudes are translated so that the global minima equal zero. Bottom: The associated persistence diagrams, plotted using the method described in Section 2.2. As $\beta$ increases, the high-frequency oscillations are less affected by the low-frequency signal and converge faster to the uniform shape of the green signal, which decreases the variance of the persistence coordinates in the red diagram.
(a) The damped cosine $e^{-2t}\cos(8\pi t)$ with additive noise $\mathcal{N}(0,0.01)$ and (b) its persistence diagram. (c) shows an uninformative prior intensity with a single component at $(1,1)$ and covariance matrix $10I$. Using the model from Equation (7) with the prior in (c) and the observed diagram in (b) yields the posterior intensity shown in (d). To account for spurious points, which we expected to have low persistence in this example, we placed components of $\lambda_{S}$ at $(0.5,0.1)$, $(1,0.1)$, $(0.75,0.1)$, and $(1.75,0.1)$.
This figure demonstrates the effect of greater low-frequency power on the persistence diagram of a signal. Panels (a) and (c) each show a signal obtained by summing a low-frequency and a high-frequency oscillator; the power of the low-frequency component is greater in (a) than in (c). To ensure that the persistence diagrams in (b) and (d) lie in $\mathbb{W}$, the aggregate signals in (a) and (c) have been translated so that their absolute minima are at zero. Notice in (b) that elements of the persistence diagram show greater spread along the Birth axis than in (d), which results in greater birth variance of the corresponding posterior intensity. Also notice the isolated high-persistence mode in (b), which is absent in (d). These phenomena arise because the low-frequency signal scatters the higher-frequency peaks along the Amplitude axis.
This plot depicts the relationship between the cardinality of persistence diagrams and the frequency of the dominant oscillation for one-second autoregressive signals across various damping factors. For each included frequency and damping factor, we simulated thirty signals (each with a frequency component fixed at zero to produce the $1/f$ PSD commonly seen in EEG), computed their persistence diagrams, and recorded their average cardinality. We see a strong positive correlation between this average cardinality and the frequency of the dominant oscillation (i.e., the PSD peak frequency), consistent with the idealized deterministic sinusoid case.
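The trend in this plot can be reproduced qualitatively with a simple AR(2) sketch. The pole parameterization below (poles at $r e^{\pm 2\pi i f/f_s}$ with $r = e^{-\beta/f_s}$) and the 250 Hz sampling rate are illustrative assumptions, not the paper's exact model; diagram cardinality is counted via local minima, since each local minimum contributes one point to the sublevel-set diagram.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 250.0  # sampling rate in Hz; an assumption for illustration

def ar2_signal(f, beta, n, rng):
    """AR(2) process with a spectral peak near f Hz and damping beta (1/s).

    Poles are placed at r * exp(+/- 2*pi*i*f/fs) with r = exp(-beta/fs),
    so larger beta gives a broader, weaker spectral peak.
    """
    r = np.exp(-beta / fs)
    a1 = 2 * r * np.cos(2 * np.pi * f / fs)
    a2 = -r ** 2
    x = np.zeros(n)
    for t in range(2, n):
        x[t] = a1 * x[t - 1] + a2 * x[t - 2] + rng.standard_normal()
    return x

def diagram_cardinality(y):
    """Sublevel-set diagram size = number of local minima of the signal."""
    interior = (y[1:-1] < y[:-2]) & (y[1:-1] <= y[2:])
    ends = (y[0] < y[1]), (y[-1] < y[-2])
    return int(interior.sum()) + sum(ends)

# average cardinality over thirty one-second realizations per peak frequency;
# the mean cardinality grows with the dominant frequency
for f in (5.0, 10.0, 20.0, 40.0):
    cards = [diagram_cardinality(ar2_signal(f, 4.0, int(fs), rng))
             for _ in range(30)]
    print(f, np.mean(cards))
```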
The peak frequency $f$ for $\mathcal{A}_{f}^{\beta}$ plotted against the average birth variance for its persistence diagrams. Colors depict the damping factor $\beta$.
The average (log) power spectral densities, along with example signals and persistence diagrams from each class, for damping factors of (top) 4 and (bottom) 32.
Parameter values for autoregressive model determined by fitting to real EEG. Missing values indicate that the optimal AR model order did not include a corresponding frequency component
Signal Length: 1 Second
            $f_1$   $f_2$    $f_3$   $f_4$   $\beta_1$   $\beta_2$   $\beta_3$   $\beta_4$
Signal 1    0       5.87     18.59   -       344.80      5.37        16.6        -
Signal 2    0       10.70    -       -       202.78      7.41        -           -

Signal Length: 5 Seconds
            $f_1$   $f_2$    $f_3$   $f_4$   $\beta_1$   $\beta_2$   $\beta_3$   $\beta_4$
Signal 1    0       6.00     14.4    20.85   24.98       10.54       31.64       26.97
Signal 2    0       10.16    23.02   -       17.24       4.06        20.37       -
Precisions and recalls for each feature and classifier. Results are reported as mean $\pm$ standard error across each class
             Bayesian                           PSD                                CWT
Classifier   Precision        Recall            Precision        Recall            Precision        Recall
LR           $0.84 \pm 0.06$  $0.85 \pm 0.07$   $0.90 \pm 0.04$  $0.90 \pm 0.04$   $0.91 \pm 0.03$  $0.90 \pm 0.04$
SVM - Lin.   $0.92 \pm 0.05$  $0.91 \pm 0.04$   $0.91 \pm 0.03$  $0.90 \pm 0.05$   $0.91 \pm 0.04$  $0.91 \pm 0.03$
MLP          $0.89 \pm 0.05$  $0.88 \pm 0.04$   $0.90 \pm 0.02$  $0.89 \pm 0.02$   $0.92 \pm 0.03$  $0.93 \pm 0.02$
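A classification experiment of this shape can be mocked up end to end with standard tools. The sketch below is a toy stand-in, not the paper's pipeline: two AR(2) classes with different peak frequencies (6 Hz vs. 20 Hz, chosen for illustration) are classified from log Welch-PSD features with logistic regression, and macro-averaged precision and recall are reported.

```python
import numpy as np
from scipy.signal import welch
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
fs = 250  # sampling rate in Hz; an illustrative assumption

def ar2(f, beta, n):
    """AR(2) process with a spectral peak near f Hz and damping beta."""
    r = np.exp(-beta / fs)
    a1, a2 = 2 * r * np.cos(2 * np.pi * f / fs), -r ** 2
    x = np.zeros(n)
    for t in range(2, n):
        x[t] = a1 * x[t - 1] + a2 * x[t - 2] + rng.standard_normal()
    return x

# 100 one-second signals per class; log Welch PSD as the feature vector
X = np.array([welch(ar2(f, 8.0, fs), fs=fs, nperseg=128)[1]
              for f in [6.0] * 100 + [20.0] * 100])
y = np.array([0] * 100 + [1] * 100)

Xtr, Xte, ytr, yte = train_test_split(np.log(X), y, test_size=0.3,
                                      random_state=0, stratify=y)
clf = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
pred = clf.predict(Xte)
print(precision_score(yte, pred, average="macro"),
      recall_score(yte, pred, average="macro"))
```

Swapping the log-PSD features for the Bayesian topological descriptor or CWT coefficients, with the same split and classifier, gives the kind of side-by-side comparison reported in the table above.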

