Advanced Search
Article Contents
Article Contents

ANAPT: Additive noise analysis for persistence thresholding

  • * Corresponding author: Audun D. Myers

    * Corresponding author: Audun D. Myers 

The first author is supported by NSF grants CMMI-1759823 and DMS-1759824

Abstract Full Text(HTML) Figure(13) / Table(3) Related Papers Cited by
  • We introduce a novel method for Additive Noise Analysis for Persistence Thresholding (ANAPT) which separates significant features in the sublevel set persistence diagram of a time series based on a statistics analysis of the persistence of a noise distribution. Specifically, we consider an additive noise model and leverage the statistical analysis to provide a noise cutoff or confidence interval in the persistence diagram for the observed time series. This analysis is done for several common noise models including Gaussian, uniform, exponential, and Rayleigh distributions. ANAPT is computationally efficient, does not require any signal pre-filtering, is widely applicable, and has open-source software available. We demonstrate the functionality of ANAPT with both numerically simulated examples and an experimental data set. Additionally, we provide an efficient $ \Theta(n\log(n)) $ algorithm for calculating the zero-dimensional sublevel set persistence homology.

    Mathematics Subject Classification: Primary: 55-08.


    \begin{equation} \\ \end{equation}
  • 加载中
  • Figure 1.  Persistence diagram summarizing the sublevel sets of a function $ f(t) $ over finite domain $ t \in[t_a, t_b] $. This function has two local minima (blue squares) and two local maxima (red circles

    Figure 2.  Sublevel set persistence applied to $ x(t) $ of a single variable function or time series with and without additive noise $ \epsilon $ from $ \mathcal{N} $, shown in red and blue, respectively. This demonstrates the stability of persistent homology with the time series (left) with and without additive noise and the small effect on the resulting persistence diagrams (right). In addition, the light red region separates the significant features from those associated to additive noise

    Figure 3.  Histograms $ h(*) $ of the zero mean normal distriubtion $ \mathcal{N}(0,\sigma^2 =1) $ and the resulting birth times $ B $ and death times $ D $, which are compared to the density distributions from Equation(5)

    Figure 4.  Additive noise probability distributions $ f(z) $ for the four models realized in this work: uniform, Gaussian, Rayleigh, and exponential

    Figure 5.  Example time series showing sample $ \delta_i $

    Figure 6.  Numeric function fitting of Equationk__ge (52) to the mean of the median lifetime $ \tilde{L} $ of $ f_i(t) $ for $ i \in [1,3] $ where $ \mathcal{N} $ is unit variance Gaussian additive noise with $ \delta \in [0,2] $ being incremented to understand the effects of signal on the median lifetime

    Figure 7.  Demonstration of distribution parameter $ \sigma $ estimation of Gaussian additive noise in $ x(t) = A\sin(\pi t) + \mathcal{N} $ using the median lifetime with and without signal compensation as $ \sigma $ and $ \sigma^* $, respectively

    Figure 8.  Comparison between the persistent homology, bootstrap resampling, and ANAPT methods for separating noise from topological signal in the sublevel set persistence diagram. The example signal is generated from a chaotic Lorenz system

    Figure 9.  Comparison of distribution parameter estimation techniques (low-Pass filter residuals, cubic spline residuals, and sublevel set persistence) for estimating $ \sigma $ of the additive Gaussian noise contaminating the $ x(t) $ solution of the Lorenz system in Eq.55

    Figure 10.  Numerical simulations with $ 10^5 $ data points for each of the four investigated noise models with probability densities on the top row and lifetime probability densities on the bottom row with their associated cutoff $ C_\alpha $. The distribution parameters were set as $ \sigma = \Delta = \lambda = \sigma = 1 $ and were estimated as $ \sigma \approx 0.996 $, $ \Delta \approx 1.013 $, $ \lambda \approx 1.009 $, and $ \sigma \approx 0.998 $ for the Gaussian, uniform, Rayleigh, and exponential distributions, respectively

    Figure 11.  Example cutoff calculation for time series $ x(t) = A(\sin(\pi t) + \sin(t)) + \epsilon $, where $ \epsilon $ is additive noise from a Gaussian distribution with zero mean and four different standard deviations as $ \sigma \in [0.2, 1, 2, 4] $. The resulting sublevel set persistence diagrams with cutoff $ C_\alpha^* $ are shown to the right

    Figure 12.  Experimental time series from free drop response of single pendulum with corresponding cutoffs in the persistence diagram and time-order lifetimes plot

    Figure 13.  Histogram of the highlighted section of signal in Fig. 12 with fitted Rayleigh distribution

    Table 1.  Ratios $ \rho = \bar{L} / \tilde{L} $ for estimating sample mean from the sample median with uncertainty as three standard deviations

    Distribution Gussian Uniform Rayleigh Exponential
    $ \rho = \bar{L}/\tilde{L} $ $ 1.154 \pm 0.012 $ $ 1.000 \pm 0.010 $ $ 1.136 \pm 0.013 $ $ 1.265 \pm 0.016 $
     | Show Table
    DownLoad: CSV

    Table 2.  Constants of Equation (52) for each distribution type investigated in this work with associated uncertainty from ten trials

    Distribution Gussian Uniform Rayleigh Exponential
    $ c_1 $ $ 0.845 \pm 0.029 $ $ 0.880 \pm 0.017 $ $ 0.726 \pm 0.026 $ $ 0.436 \pm 0.036 $
    $ c_2 $ $ 0.809 \pm 0.061 $ $ 0.639 \pm 0.026 $ $ 0.605 \pm 0.054 $ $ 0.393 \pm 0.075 $
     | Show Table
    DownLoad: CSV

    Table Algorithm 1.  Zero-Dimensional Persistence Algorithm

    Data: Array A of n Real Numbers
    Result: Persistence Diagram
    $\begin{array}{l} {\bf{begin}}\\ \;\;\left| \begin{array}{l} \;\;\;\;M = {\rm{list}}\;{\rm{of}}\;{\rm{local}}\;{\rm{extrema}}\;{\rm{in}}\;A\left( {{\rm{non - endpoint}}\;{\rm{maxima}}\;{\rm{and}}\;{\rm{minima}}} \right);\\ \;\;\;\;Q = {\rm{priority}}\;{\rm{queue}}\;{\rm{of}}\;{\rm{pointers}}\;{\rm{to}}\;{\rm{elements}}\;{\rm{of}}\;M;\\ \;\;\;\;{\bf{while}}\;\left| M \right| > 3\;{\bf{do}}\\ \;\;\;\;\;\;\left| \begin{array}{l} m \leftarrow Q.pop;\\ b \leftarrow {\rm{min}}\left\{ {m.height,m.next.height} \right\};\\ d \leftarrow {\rm{max}}\left\{ {m.height,m.next.height} \right\};\\ {\rm{Add}}\;(b,d){\rm{ to}}\;pairs;\\ {\rm{Update}}\;Q\;{\rm{and}}\;M; \end{array} \right.\\ \;\;\;\;{\bf{end}}\\ \;\;\;\;{\bf{return}}\;pairs \end{array} \right.\\ {\bf{end}} \end{array}$
     | Show Table
    DownLoad: CSV
  • [1] N. AtienzaR. Gonzalez-Diaz and M. Rucco, Persistent entropy for separating topological features from noise in vietoris-rips complexes, Journal of Intelligent Information Systems, 52 (2017), 637-655.  doi: 10.1007/s10844-017-0473-4.
    [2] G. E. P. Box, G. M. Jenkins, G. C. Reinsel and G. M. Ljung, Time Series Analysis: Forecasting and Control, John Wiley & Sons, 2016.
    [3] P. Bubenik, Statistical topological data analysis using persistence landscapes, Journal of Machine Learning Research, 16 (2015), 77-102. 
    [4] P. Bühlmann and P. Buhlmann, Sieve bootstrap for time series, Bernoulli, 3 (1997), 123-148.  doi: 10.2307/3318584.
    [5] G. Carlsson, J. Gorham, M. Kahle and J. Mason, Computational topology for configuration spaces of hard disks,, Physical Review E, 85 (2012), 011303. doi: 10.1103/PhysRevE. 85.011303.
    [6] F. Chazal, V. De Silva, M. Glisse and S. Oudot, The Structure and Stability of Persistence Modules, SpringerBriefs in Mathematics. Springer, [Cham], 2016. doi: 10.1007/978-3-319-42545-0.
    [7] F. Chazal, B. T. Fasy, F. Lecci, B. Michel, A. Rinaldo and L. Wasserman, Robust topological inference: Distance to a measure and kernel distance, J. Mach. Learn. Res., 18 (2018), Paper No. 159, 40 pp.
    [8] F. Chazal, B. T. Fasy, F. Lecci, A. Rinaldo, A. Singh and L. Wasserman, On the bootstrap for persistence diagrams and landscapes, arXiv preprint, arXiv: 1311.0376, 2013.
    [9] S. Chowdhury and F. Mémoli, Convergence of hierarchical clustering and persistent homology methods on directed networks, Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, 1152–1169, SIAM, Philadelphia, PA, 2018. doi: 10.1137/1.9781611975031.75.
    [10] D. Cohen-SteinerH. Edelsbrunner and J. Harer, Stability of persistence diagrams, Discrete & Computational Geometry, 37 (2006), 103-120.  doi: 10.1007/s00454-006-1276-5.
    [11] S. Czesla, T. Molle and J. H. M. M. Schmitt, A posteriori noise estimation in variable data sets, Astronomy & Astrophysics, 609 (2018), A39. doi: 10.1051/0004-6361/201730618.
    [12] C. J. A. Delfinado and H. Edelsbrunner, An incremental algorithm for Betti numbers of simplicial complexes on the 3-sphere, Computer Aided Geometric Design, 12 (1995), 771-784.  doi: 10.1016/0167-8396(95)00016-Y.
    [13] M. Dindin, Y. Umeda and F. Chazal, Topological data analysis for arrhythmia detection through modular neural networks, In Advances in Artificial Intelligence, 2020, 177–188. doi: 10.1007/978-3-030-47358-7_17.
    [14] H. Edelsbrunner and J. Harer, Computational Topology - an Introduction, American Mathematical Society, Providence, RI, 2010. doi: 10.1090/mbk/069.
    [15] H. Edelsbrunner and J. Harer, Persistent homology-a survey, Contemporary Mathematics, 453 (2008), 257-282.  doi: 10.1090/conm/453/08802.
    [16] H. EdelsbrunnerD. Letscher and A. Zomorodian, Topological persistence and simplification, Discrete & Computational Geometry, 28 (2002), 511-533.  doi: 10.1007/s00454-002-2885-2.
    [17] B. T. FasyF. LecciA. RinaldoL. WassermanS. BalakrishnanA. Singh and et al., Confidence sets for persistence diagrams, The Annals of Statistics, 42 (2014), 2301-2339.  doi: 10.1214/14-AOS1252.
    [18] S. Gholizadeh and W. Zadrozny, A short survey of topological data analysis in time series and systems analysis, arXiv preprint, arXiv: 1809.10745, 2018.
    [19] J. HuJ. B. Gao and K. D. White, Estimating measurement noise in a time series by exploiting nonstationarity, Chaos, Solitons & Fractals, 22 (2004), 807-819.  doi: 10.1016/j.chaos.2004.02.061.
    [20] F. A. Khasawneh and E. Munch, Utilizing topological data analysis for studying signals of time-delay systems, Time Delay Systems, Adv. Delays Dyn., Springer, Cham, 7 (2017), 93–106.
    [21] F. A. Khasawneh and E. Munch, Topological data analysis for true step detection in periodic piecewise constant signals, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Science, 474 (2018), 20180027, 24pp. doi: 10.1098/rspa. 2018.0027.
    [22] F. A. Khasawneh, E. Munch and J. A. Perea, Chatter classification in turning using machine learning and topological data analysis, IFAC-PapersOnLine, 51 (2018), 195–200. doi: 10.1016/j. ifacol. 2018.07.222.
    [23] P. Lawson, A. B. Sholl, J. Quincy Brown, B. Terese Fasy and C. Wenk, Persistent homology for the quantitative evaluation of architectural features in prostate cancer histology, Scientific Reports, 9 (2019), Article number: 1139. doi: 10.1038/s41598-018-36798-y.
    [24] H. Lee, H. Kang, M. K. Chung, B. -N. Kim and D. S. Lee, Persistent brain network homology from the perspective of dendrogram, IEEE Transactions on Medical Imaging, 31 (2012), 2267–2277. doi: 10.1109/TMI. 2012.2219590.
    [25] E. Munch, A user's guide to topological data analysis, Journal of Learning Analytics, 4, 2017. doi: 10.18608/jla. 2017.42.6.
    [26] A. Myers and F. A. Khasawneh, On the automatic parameter selection for permutation entropy, Chaos: An Interdisciplinary Journal of Nonlinear Science, 30 (2020), 033130, 17 pp. doi: 10.1063/1.5111719.
    [27] A. Myers, E. Munch and F. A Khasawneh, Persistent homology of complex networks for dynamic state detection, Phys. Rev. E, 100 (2019), 022314, 14 pp. doi: 10.1103/physreve. 100.022314.
    [28] A. Otto, M. C. Yesilli and F. A. Khasawneh, Topological feature vectors for chatter detection in turning processes, The International Journal of Advanced Manufacturing Technology, 2022. arXiv: 1905.08671. doi: 10.1007/s00170-021-08242-5.
    [29] J. A. Perea, A brief history of persistence.,
    [30] J. A. Perea and J. Harer, Sliding windows and persistence: An application of topological methods to signal analysis, Foundations of Computational Mathematics, 15 (2015), 799-838.  doi: 10.1007/s10208-014-9206-z.
    [31] D. Petrushenko and F. A. Khasawneh, Uncertainty propagation of system parameters to the dynamic response: An application to a benchtop pendulum, ASME 2017 International Mechanical Engineering Congress and Exposition, 2018, 10 pages. doi: 10.1115/IMECE2017-71105.
    [32] D. N. Politis and J. P. Romano, The stationary bootstrap, Journal of the American Statistical Association, 89 (1994), 1303-1313.  doi: 10.1080/01621459.1994.10476870.
    [33] M. Rucco, F. Castiglione, E. Merelli and M. Pettini, Characterisation of the idiotypic immune network through persistent entropy, In Proceedings of ECCS 2014, Springer International Publishing, 2016, 117–128. doi: 10.1007/978-3-319-29228-1_11.
    [34] F. Takens, Detecting strange attractors in turbulence, Dynamical Systems and Turbulence, Warwick 1980 (Coventry, 1979/1980), 366–381, Lecture Notes in Math., 898, Springer, Berlin-New York, 1981.
    [35] C. J. Tralie and J. A. Perea, (Quasi)periodicity quantification in video data, using topology, SIAM Journal on Imaging Sciences, 11 (2018), 1049–1077. doi: 10.1137/17M1150736.
    [36] K. Urbanowicz and J. A. Hołyst, Noise-level estimation of time series using coarse-grained entropy, Physical Review E, 67 (2003), 046218.  doi: 10.1103/PhysRevE.67.046218.
    [37] X. Wan, W. Wang, J. Liu and T. Tong, Estimating the sample mean and standard deviation from the sample size, median, range and/or interquartile range, BMC Medical Research Methodology, 14 (2014), Article number: 135. doi: 10.1186/1471-2288-14-135.
    [38] G. R. Wood and B. P. Zhang, Estimation of the lipschitz constant of a function, Journal of Global Optimization, 8 (1996), 91-103.  doi: 10.1007/BF00229304.
  • 加载中




Article Metrics

HTML views(1468) PDF downloads(281) Cited by(0)

Access History

Other Articles By Authors



    DownLoad:  Full-Size Img  PowerPoint