# American Institute of Mathematical Sciences

March  2022, 4(1): 1-36. doi: 10.3934/fods.2021033

## Capturing dynamics of time-varying data via topology

 1 School of Information, University of Michigan, Ann Arbor, MI 48109, USA 2 Department of Mathematics, Colorado State University, Fort Collins, CO 80523, USA 3 Department of Mathematics and Statistics, Williams College, Williamstown, MA 01267, USA 4 Department of Mathematics, Statistics, and Computer Science, Macalester College, Saint Paul, MN 55105, USA

* Corresponding author: Lori Ziegelmeier

Received  June 2021 Revised  October 2021 Published  March 2022 Early access  December 2021

Fund Project: L.X. was funded by Macalester College through a grant to L.Z. H.A. was supported by NSF grant 1934725, DELTA: Descriptors of Energy Landscapes by Topological Analysis. C.M.T. was supported by NSF grant DMS-1813752, Variational and Topological Approaches to Complex Systems. L.Z. was supported by NSF grant CDS & E-MSS-1854703, Exact Homological Algebra for Computational Topology (ExHACT)

One approach to understanding complex data is to study its shape through the lens of algebraic topology. While the early development of topological data analysis focused primarily on static data, in recent years, theoretical and applied studies have turned to data that varies in time. A time-varying collection of metric spaces as formed, for example, by a moving school of fish or flock of birds, can contain a vast amount of information. There is often a need to simplify or summarize the dynamic behavior. We provide an introduction to topological summaries of time-varying metric spaces including vineyards [19], crocker plots [55], and multiparameter rank functions [37]. We then introduce a new tool to summarize time-varying metric spaces: a crocker stack. Crocker stacks are convenient for visualization, amenable to machine learning, and satisfy a desirable continuity property which we prove. We demonstrate the utility of crocker stacks for a parameter identification task involving an influential model of biological aggregations [57]. Altogether, we aim to bring the broader applied mathematics community up-to-date on topological summaries of time-varying metric spaces.

Citation: Lu Xian, Henry Adams, Chad M. Topaz, Lori Ziegelmeier. Capturing dynamics of time-varying data via topology. Foundations of Data Science, 2022, 4 (1) : 1-36. doi: 10.3934/fods.2021033
##### References:
 [1] H. Adams and G. Carlsson, Evasion paths in mobile sensor networks, International Journal of Robotics Research, 34 (2015), 90-104. [2] H. Adams, T. Emerson, M. Kirby, R. Neville, C. Peterson, P. Shipman, S. Chepushtanova, E. Hanson, F. Motta and L. Ziegelmeier, Persistence images: A stable vector representation of persistent homology, J. Mach. Learn. Res., 18 (2017), Paper No. 8, 35 pp. http://jmlr.org/papers/v18/16-337.html. [3] H. Adams, D. Ghosh, C. Mask, W. Ott and K. Williams, Efficient evader detection in mobile sensor networks, arXiv preprint, arXiv: 2101.09813. [4] P. Arora, D. Deepali and S. Varshney, Analysis of K-means and K-medoids algorithm for big data, Procedia Computer Science, 78 (2016), 507-512.  doi: 10.1016/j.procs.2016.02.095. [5] A. Banman and L. Ziegelmeier, Mind the gap: A study in global development through persistent homology, in Research in Computational Topology, Springer, 2018,125–144. doi: 10.1007/978-3-319-89593-2_8. [6] D. Bhaskar, A. Manhart, J. Milzman, J. T. Nardini, K. M. Storey, C. M. Topaz and L. Ziegelmeier, Analyzing collective motion with machine learning and topology, Chaos: An Interdisciplinary Journal of Nonlinear Science, 29 (2019), 123125, 12 pp. doi: 10.1063/1.5125493. [7] P. Bubenik, Statistical topological data analysis using persistence landscapes, J. Mach. Learn. Res., 16 (2015), 77-102. [8] D. Burago, Y. Burago and S. Ivanov, A course in Metric Geometry, vol. 33, American Mathematical Society, Providence, 2001. doi: 10.1090/gsm/033. [9] G. Carlsson, Topology and data, Bull. Amer. Math. Soc. (N.S.), 46 (2009), 255-308.  doi: 10.1090/S0273-0979-09-01249-X. [10] G. Carlsson and V. de Silva, Zigzag persistence, Found. Comput. Math., 10 (2010), 367-405.  doi: 10.1007/s10208-010-9066-0. [11] G. Carlsson, V. de Silva, S. Kališnik and D. Morozov, Parametrized homology via zigzag persistence, Algebr. Geom. Topol., 19 (2019), 657-700.  doi: 10.2140/agt.2019.19.657. [12] G. Carlsson, V. de Silva and D. Morozov, Zigzag persistent homology and real-valued functions, in Proceedings of the Twenty-Fifth Annual Symposium on Computational Geometry, ACM, 2009,247–256. doi: 10.1145/1542362.1542408. [13] G. Carlsson, G. Singh and A. Zomorodian, Computing multidimensional persistence, Algorithms and computation, 730–739, Lecture Notes in Comput. Sci., 5878, Springer, Berlin, 2009. doi: 10.1007/978-3-642-10631-6_74. [14] G. Carlsson and A. Zomorodian, The theory of multidimensional persistence, Discrete Comput. Geom., 42 (2009), 71-93.  doi: 10.1007/s00454-009-9176-0. [15] A. Cerri, B. D. Fabio, M. Ferri, P. Frosini and C. Landi, Betti numbers in multidimensional persistent homology are stable functions, Math. Methods Appl. Sci., 36 (2013), 1543-1557.  doi: 10.1002/mma.2704. [16] W. Chachólski, M. Scolamiero and F. Vaccarino, Combinatorial presentation of multidimensional persistent homology, J. Pure Appl. Algebra, 221 (2017), 1055-1075.  doi: 10.1016/j.jpaa.2016.09.001. [17] F. Chazal, V. de Silva and S. Oudot, Persistence stability for geometric complexes, Geometriae Dedicata, 174 (2014), 193-214.  doi: 10.1007/s10711-013-9937-z. [18] D. Cohen-Steiner, H. Edelsbrunner and J. Harer, Stability of persistence diagrams, Discrete Comput. Geom., 37 (2007), 103-120.  doi: 10.1007/s00454-006-1276-5. [19] D. Cohen-Steiner, H. Edelsbrunner and D. Morozov, Vines and vineyards by updating persistence in linear time, in Computational Geometry (SCG'06), ACM, 2006,119–126. doi: 10.1145/1137856.1137877. [20] P. Corcoran and C. B. Jones, Modelling topological features of swarm behaviour in space and time with persistence landscapes, IEEE Access, 5 (2017), 18534-18544.  doi: 10.1109/ACCESS.2017.2749319. [21] D. B. Damiano and M. R. McGuirl, A topological analysis of targeted in-111 uptake in SPECT images of murine tumors, J. Math. Biol., 76 (2018), 1559-1587.  doi: 10.1007/s00285-017-1184-8. [22] V. de Silva and R. Ghrist, Coordinate-free coverage in sensor networks with controlled boundaries via homology, The International Journal of Robotics Research, 25 (2006), 1205-1222.  doi: 10.1177/0278364906072252. [23] V. de Silva and R. Ghrist, Coverage in sensor networks via persistent homology, Algebr. Geom. Topol., 7 (2007), 339-358.  doi: 10.2140/agt.2007.7.339. [24] T. K. Dey and C. Xin, Computing bottleneck distance for 2-d interval decomposable modules, arXiv preprint, arXiv: 1803.02869. [25] M. R. D'Orsogna, Y. L. Chuang, A. L. Bertozzi and L. S. Chayes, Self-propelled particles with soft-core interactions: Patterns, stability, and collapse, Phys. Rev. Lett., 96 (2006), 104302.  doi: 10.1103/PhysRevLett.96.104302. [26] H. Edelsbrunner and J. L. Harer, Computational Topology: An Introduction, American Mathematical Society, Providence, 2010. doi: 10.1090/mbk/069. [27] H. Edelsbrunner, D. Morozov and A. Patel, The stability of the apparent contour of an orientable 2-manifold, Topological Methods in Data Analysis and Visualization. Mathematics and Visualization., 27–41, Math. Vis., Springer, Heidelberg, 2011. doi: 10.1007/978-3-642-15014-2_3. [28] B. T. Fasy, J. Kim, F. Lecci, C. Maria, D. L. Millman and V. Rouvreau, Tda: Statistical tools for topological data analysis, https://cran.r-project.org/web/packages/TDA/index.html. [29] M. Feng and M. A. Porter, Persistent homology of geospatial data: A case study with voting, SIAM Rev., 63 (2021), 67-99.  doi: 10.1137/19M1241519. [30] M. Feng and M. A. Porter, Spatial applications of topological data analysis: Cities, snowflakes, random structures, and spiders spinning under the influence, Phys. Rev. Research, 2 (2020), 033426.  doi: 10.1103/PhysRevResearch.2.033426. [31] R. Ghrist, Barcodes: The persistent topology of data, ull. Amer. Math. Soc. (N.S.), 45 (2008), 61-75.  doi: 10.1090/S0273-0979-07-01191-3. [32] C. Giusti, L. Papadopoulos, E. T. Owens, K. E. Daniels and D. S. Bassett, Topological and geometric measurements of force-chain structure, Physical Review E, 94 (2016), 032909.  doi: 10.1103/PhysRevE.94.032909. [33] I. T. Jolliffe, Principal Component Analysis, Springer Verlag, 1986. doi: 10.1007/978-1-4757-1904-8. [34] T. Kaczynski, K. Mischaikow and M. Mrozek, Computational Homology, vol. 157, pringer-Verlag, New York, 2004. doi: 10.1007/b97315. [35] L. Kaufman and P. Rousseeuw, Clustering by Means of Medoids, North-Holland, 1987. [36] W. Kim and F. Mémoli, Stable signatures for dynamic metric spaces via zigzag persistent homology, arXiv preprint, arXiv: 1712.04064. [37] W. Kim and F. Mémoli, Spatiotemporal persistent homology for dynamic metric spaces, Discrete Comput. Geom., 66 (2021), 831-875.  doi: 10.1007/s00454-019-00168-w. [38] M. Lesnick, The theory of the interleaving distance on multidimensional persistence modules, Found. Comput. Math., 15 (2015), 613-650.  doi: 10.1007/s10208-015-9255-y. [39] M. Maechler, Finding groups in data: Cluster analysis extended rousseeuw et al, https://cran.r-project.org/web/packages/cluster/cluster.pdf. [40] A. McCleary and A. Patel, Bottleneck stability for generalized persistence diagrams, Proc. Amer. Math. Soc., 148 (2020), 3149-3161.  doi: 10.1090/proc/14929. [41] A. McCleary and A. Patel, Edit distance and persistence diagrams over lattices, arXiv preprint, arXiv: 2010.07337. [42] E. Miller, Data structures for real multiparameter persistence modules, arXiv preprint, arXiv: 1709.08155. [43] N. Milosavljević, D. Morozov and P. Škraba, Zigzag persistent homology in matrix multiplication time, in Computational geometry (SCG'11), 2011,216–225. doi: 10.1145/1998196.1998229. [44] D. Morozov, Personal communication. [45] D. Morozov, Dionysus, http://www.mrzv.org/software/dionysus/. [46] J. R. Munkres, Topology, Prentice-Hall Englewood Cliffs, NJ, 1975. [47] C. Nilsen, J. Paige, O. Warner, B. Mayhew, R. Sutley, M. Lam, A. J. Bernoff and C. M. Topaz, Social aggregation in pea aphids: Experiment and random walk modeling, PLoS ONE, 8 (2013), e83343. doi: 10.1371/journal.pone.0083343. [48] N. Otter, M. A. Porter, U. Tillmann, P. Grindrod and H. A. Harrington, A roadmap for the computation of persistent homology, EPJ Data Science, 6 (2017), 17. [49] S. Y. Oudot, Persistence Theory: From Quiver Representations to Data Analysis, vol. 209, American Mathematical Society Providence, RI, 2015. doi: 10.1090/surv/209. [50] H.-S. Park and C.-H. Jun, A simple and fast algorithm for $k$-medoids clustering, Expert Systems with Applications, 36 (2009), 3336-3341.  doi: 10.1016/j.eswa.2008.01.039. [51] A. Patel, Generalized persistence diagrams, J. Appl. Comput. Topol., 1 (2018), 397-419.  doi: 10.1007/s41468-018-0012-6. [52] V. Puuska, Erosion distance for generalized persistence modules, Homology Homotopy Appl., 22 (2020), 233-254.  doi: 10.4310/HHA.2020.v22.n1.a14. [53] M. Scolamiero, W. Chachólski, A. Lundman, R. Ramanujam and S. Öberg, Multidimensional persistence and noise, Found. Comput. Math., 17 (2017), 1367-1406.  doi: 10.1007/s10208-016-9323-y. [54] B. J. Stolz, H. A. Harrington and M. A. Porter, Persistent homology of time-dependent functional networks constructed from coupled time series, Chaos, 27 (2017), 047410, 17 pp. doi: 10.1063/1.4978997. [55] C. M. Topaz, L. Ziegelmeier and T. Halverson, Topological data analysis of biological aggregation models, PloS One, 10 (2015), e0126383. doi: 10.1371/journal.pone.0126383. [56] M. Ulmer, L. Ziegelmeier and C. M. Topaz, A topological approach to selecting models of biological experiments, PloS One, 14 (2019), e0213679. doi: 10.1371/journal.pone.0213679. [57] T. Vicsek, A. Czirók, E. Ben-Jacob, I. Cohen and O. Shochet, Novel type of phase transition in a system of self-driven particles, Phys. Rev. Lett., 75 (1995), 1226-1229.  doi: 10.1103/PhysRevLett.75.1226. [58] T. Vicsek and A. Zafeiris, Collective motion, Physics Reports, 517 (2012), 71-140.  doi: 10.1016/j.physrep.2012.03.004. [59] X. Zhu, Persistent homology: An introduction and a new text representation for natural language processing, in Twenty-Third International Joint Conference on Artificial Intelligence, 2013. [60] A. Zomorodian and G. Carlsson, Computing persistent homology, Discrete Comput. Geom., 33 (2005), 249-274.  doi: 10.1007/s00454-004-1146-y.

show all references

##### References:
 [1] H. Adams and G. Carlsson, Evasion paths in mobile sensor networks, International Journal of Robotics Research, 34 (2015), 90-104. [2] H. Adams, T. Emerson, M. Kirby, R. Neville, C. Peterson, P. Shipman, S. Chepushtanova, E. Hanson, F. Motta and L. Ziegelmeier, Persistence images: A stable vector representation of persistent homology, J. Mach. Learn. Res., 18 (2017), Paper No. 8, 35 pp. http://jmlr.org/papers/v18/16-337.html. [3] H. Adams, D. Ghosh, C. Mask, W. Ott and K. Williams, Efficient evader detection in mobile sensor networks, arXiv preprint, arXiv: 2101.09813. [4] P. Arora, D. Deepali and S. Varshney, Analysis of K-means and K-medoids algorithm for big data, Procedia Computer Science, 78 (2016), 507-512.  doi: 10.1016/j.procs.2016.02.095. [5] A. Banman and L. Ziegelmeier, Mind the gap: A study in global development through persistent homology, in Research in Computational Topology, Springer, 2018,125–144. doi: 10.1007/978-3-319-89593-2_8. [6] D. Bhaskar, A. Manhart, J. Milzman, J. T. Nardini, K. M. Storey, C. M. Topaz and L. Ziegelmeier, Analyzing collective motion with machine learning and topology, Chaos: An Interdisciplinary Journal of Nonlinear Science, 29 (2019), 123125, 12 pp. doi: 10.1063/1.5125493. [7] P. Bubenik, Statistical topological data analysis using persistence landscapes, J. Mach. Learn. Res., 16 (2015), 77-102. [8] D. Burago, Y. Burago and S. Ivanov, A course in Metric Geometry, vol. 33, American Mathematical Society, Providence, 2001. doi: 10.1090/gsm/033. [9] G. Carlsson, Topology and data, Bull. Amer. Math. Soc. (N.S.), 46 (2009), 255-308.  doi: 10.1090/S0273-0979-09-01249-X. [10] G. Carlsson and V. de Silva, Zigzag persistence, Found. Comput. Math., 10 (2010), 367-405.  doi: 10.1007/s10208-010-9066-0. [11] G. Carlsson, V. de Silva, S. Kališnik and D. Morozov, Parametrized homology via zigzag persistence, Algebr. Geom. Topol., 19 (2019), 657-700.  doi: 10.2140/agt.2019.19.657. [12] G. Carlsson, V. de Silva and D. Morozov, Zigzag persistent homology and real-valued functions, in Proceedings of the Twenty-Fifth Annual Symposium on Computational Geometry, ACM, 2009,247–256. doi: 10.1145/1542362.1542408. [13] G. Carlsson, G. Singh and A. Zomorodian, Computing multidimensional persistence, Algorithms and computation, 730–739, Lecture Notes in Comput. Sci., 5878, Springer, Berlin, 2009. doi: 10.1007/978-3-642-10631-6_74. [14] G. Carlsson and A. Zomorodian, The theory of multidimensional persistence, Discrete Comput. Geom., 42 (2009), 71-93.  doi: 10.1007/s00454-009-9176-0. [15] A. Cerri, B. D. Fabio, M. Ferri, P. Frosini and C. Landi, Betti numbers in multidimensional persistent homology are stable functions, Math. Methods Appl. Sci., 36 (2013), 1543-1557.  doi: 10.1002/mma.2704. [16] W. Chachólski, M. Scolamiero and F. Vaccarino, Combinatorial presentation of multidimensional persistent homology, J. Pure Appl. Algebra, 221 (2017), 1055-1075.  doi: 10.1016/j.jpaa.2016.09.001. [17] F. Chazal, V. de Silva and S. Oudot, Persistence stability for geometric complexes, Geometriae Dedicata, 174 (2014), 193-214.  doi: 10.1007/s10711-013-9937-z. [18] D. Cohen-Steiner, H. Edelsbrunner and J. Harer, Stability of persistence diagrams, Discrete Comput. Geom., 37 (2007), 103-120.  doi: 10.1007/s00454-006-1276-5. [19] D. Cohen-Steiner, H. Edelsbrunner and D. Morozov, Vines and vineyards by updating persistence in linear time, in Computational Geometry (SCG'06), ACM, 2006,119–126. doi: 10.1145/1137856.1137877. [20] P. Corcoran and C. B. Jones, Modelling topological features of swarm behaviour in space and time with persistence landscapes, IEEE Access, 5 (2017), 18534-18544.  doi: 10.1109/ACCESS.2017.2749319. [21] D. B. Damiano and M. R. McGuirl, A topological analysis of targeted in-111 uptake in SPECT images of murine tumors, J. Math. Biol., 76 (2018), 1559-1587.  doi: 10.1007/s00285-017-1184-8. [22] V. de Silva and R. Ghrist, Coordinate-free coverage in sensor networks with controlled boundaries via homology, The International Journal of Robotics Research, 25 (2006), 1205-1222.  doi: 10.1177/0278364906072252. [23] V. de Silva and R. Ghrist, Coverage in sensor networks via persistent homology, Algebr. Geom. Topol., 7 (2007), 339-358.  doi: 10.2140/agt.2007.7.339. [24] T. K. Dey and C. Xin, Computing bottleneck distance for 2-d interval decomposable modules, arXiv preprint, arXiv: 1803.02869. [25] M. R. D'Orsogna, Y. L. Chuang, A. L. Bertozzi and L. S. Chayes, Self-propelled particles with soft-core interactions: Patterns, stability, and collapse, Phys. Rev. Lett., 96 (2006), 104302.  doi: 10.1103/PhysRevLett.96.104302. [26] H. Edelsbrunner and J. L. Harer, Computational Topology: An Introduction, American Mathematical Society, Providence, 2010. doi: 10.1090/mbk/069. [27] H. Edelsbrunner, D. Morozov and A. Patel, The stability of the apparent contour of an orientable 2-manifold, Topological Methods in Data Analysis and Visualization. Mathematics and Visualization., 27–41, Math. Vis., Springer, Heidelberg, 2011. doi: 10.1007/978-3-642-15014-2_3. [28] B. T. Fasy, J. Kim, F. Lecci, C. Maria, D. L. Millman and V. Rouvreau, Tda: Statistical tools for topological data analysis, https://cran.r-project.org/web/packages/TDA/index.html. [29] M. Feng and M. A. Porter, Persistent homology of geospatial data: A case study with voting, SIAM Rev., 63 (2021), 67-99.  doi: 10.1137/19M1241519. [30] M. Feng and M. A. Porter, Spatial applications of topological data analysis: Cities, snowflakes, random structures, and spiders spinning under the influence, Phys. Rev. Research, 2 (2020), 033426.  doi: 10.1103/PhysRevResearch.2.033426. [31] R. Ghrist, Barcodes: The persistent topology of data, ull. Amer. Math. Soc. (N.S.), 45 (2008), 61-75.  doi: 10.1090/S0273-0979-07-01191-3. [32] C. Giusti, L. Papadopoulos, E. T. Owens, K. E. Daniels and D. S. Bassett, Topological and geometric measurements of force-chain structure, Physical Review E, 94 (2016), 032909.  doi: 10.1103/PhysRevE.94.032909. [33] I. T. Jolliffe, Principal Component Analysis, Springer Verlag, 1986. doi: 10.1007/978-1-4757-1904-8. [34] T. Kaczynski, K. Mischaikow and M. Mrozek, Computational Homology, vol. 157, pringer-Verlag, New York, 2004. doi: 10.1007/b97315. [35] L. Kaufman and P. Rousseeuw, Clustering by Means of Medoids, North-Holland, 1987. [36] W. Kim and F. Mémoli, Stable signatures for dynamic metric spaces via zigzag persistent homology, arXiv preprint, arXiv: 1712.04064. [37] W. Kim and F. Mémoli, Spatiotemporal persistent homology for dynamic metric spaces, Discrete Comput. Geom., 66 (2021), 831-875.  doi: 10.1007/s00454-019-00168-w. [38] M. Lesnick, The theory of the interleaving distance on multidimensional persistence modules, Found. Comput. Math., 15 (2015), 613-650.  doi: 10.1007/s10208-015-9255-y. [39] M. Maechler, Finding groups in data: Cluster analysis extended rousseeuw et al, https://cran.r-project.org/web/packages/cluster/cluster.pdf. [40] A. McCleary and A. Patel, Bottleneck stability for generalized persistence diagrams, Proc. Amer. Math. Soc., 148 (2020), 3149-3161.  doi: 10.1090/proc/14929. [41] A. McCleary and A. Patel, Edit distance and persistence diagrams over lattices, arXiv preprint, arXiv: 2010.07337. [42] E. Miller, Data structures for real multiparameter persistence modules, arXiv preprint, arXiv: 1709.08155. [43] N. Milosavljević, D. Morozov and P. Škraba, Zigzag persistent homology in matrix multiplication time, in Computational geometry (SCG'11), 2011,216–225. doi: 10.1145/1998196.1998229. [44] D. Morozov, Personal communication. [45] D. Morozov, Dionysus, http://www.mrzv.org/software/dionysus/. [46] J. R. Munkres, Topology, Prentice-Hall Englewood Cliffs, NJ, 1975. [47] C. Nilsen, J. Paige, O. Warner, B. Mayhew, R. Sutley, M. Lam, A. J. Bernoff and C. M. Topaz, Social aggregation in pea aphids: Experiment and random walk modeling, PLoS ONE, 8 (2013), e83343. doi: 10.1371/journal.pone.0083343. [48] N. Otter, M. A. Porter, U. Tillmann, P. Grindrod and H. A. Harrington, A roadmap for the computation of persistent homology, EPJ Data Science, 6 (2017), 17. [49] S. Y. Oudot, Persistence Theory: From Quiver Representations to Data Analysis, vol. 209, American Mathematical Society Providence, RI, 2015. doi: 10.1090/surv/209. [50] H.-S. Park and C.-H. Jun, A simple and fast algorithm for $k$-medoids clustering, Expert Systems with Applications, 36 (2009), 3336-3341.  doi: 10.1016/j.eswa.2008.01.039. [51] A. Patel, Generalized persistence diagrams, J. Appl. Comput. Topol., 1 (2018), 397-419.  doi: 10.1007/s41468-018-0012-6. [52] V. Puuska, Erosion distance for generalized persistence modules, Homology Homotopy Appl., 22 (2020), 233-254.  doi: 10.4310/HHA.2020.v22.n1.a14. [53] M. Scolamiero, W. Chachólski, A. Lundman, R. Ramanujam and S. Öberg, Multidimensional persistence and noise, Found. Comput. Math., 17 (2017), 1367-1406.  doi: 10.1007/s10208-016-9323-y. [54] B. J. Stolz, H. A. Harrington and M. A. Porter, Persistent homology of time-dependent functional networks constructed from coupled time series, Chaos, 27 (2017), 047410, 17 pp. doi: 10.1063/1.4978997. [55] C. M. Topaz, L. Ziegelmeier and T. Halverson, Topological data analysis of biological aggregation models, PloS One, 10 (2015), e0126383. doi: 10.1371/journal.pone.0126383. [56] M. Ulmer, L. Ziegelmeier and C. M. Topaz, A topological approach to selecting models of biological experiments, PloS One, 14 (2019), e0213679. doi: 10.1371/journal.pone.0213679. [57] T. Vicsek, A. Czirók, E. Ben-Jacob, I. Cohen and O. Shochet, Novel type of phase transition in a system of self-driven particles, Phys. Rev. Lett., 75 (1995), 1226-1229.  doi: 10.1103/PhysRevLett.75.1226. [58] T. Vicsek and A. Zafeiris, Collective motion, Physics Reports, 517 (2012), 71-140.  doi: 10.1016/j.physrep.2012.03.004. [59] X. Zhu, Persistent homology: An introduction and a new text representation for natural language processing, in Twenty-Third International Joint Conference on Artificial Intelligence, 2013. [60] A. Zomorodian and G. Carlsson, Computing persistent homology, Discrete Comput. Geom., 33 (2005), 249-274.  doi: 10.1007/s00454-004-1146-y.
A filled-in disk (left) has Betti numbers $(\beta_0, \beta_1, \beta_2, \beta_3, \ldots) = (1, 0, 0, 0, \ldots)$. A hollow square (center) has Betti numbers $(1, 1, 0, 0, \ldots)$. A hollow two-torus (right) has Betti numbers $(1, 2, 1, 0, \ldots)$. If we consider the union of these three shapes, the Betti numbers are $(\beta_0, \beta_1, \beta_2, \beta_3, \ldots) = (3, 3, 1, 0, \ldots)$. Image of the torus taken from Wikimedia Commons https://commons.wikimedia.org/wiki/File:Torus.svg, available for reuse under CCA BY-SA 3.0
An illustrative example of a crocker stack for $H_0$ computed from a simulation from the Viscek model with noise parameter $\eta = 0.02$; see Section 5. The variable $\alpha$ is a smoothing parameter which captures the persistence of topological features
Persistence diagram (Left) and corresponding persistence barcode (Right). Let the persistence barcode $V$ consist of the intervals $[1, 7]$, $[2, 9]$, $[3, 11]$, $[5, 10]$, $[5, 9]$, and let $4 = i \le j = 8$. Then $\mathrm{rank}(V(4) \to V(8))$ is two since there are two intervals in the persistence diagram that contain the interval $[i, j] = [4, 8]$
Round points (in red) represent points in persistence diagram $A$. Triangular points (in blue) represent points in persistence diagram $B$. The bottleneck distance between persistence diagrams $A$ and $B$ is computed by taking the largest $L_\infty$ distance between matched round and triangular points of a bijection between $A$ and $B$, and then taking the infimum over all bijections. The three boxes (in green) show the optimal matching of three pairs of round and triangular points. The unboxed points match to the closest points on the diagonal
Vines and vineyard. Each point (in red) represents a point on a persistence diagram. Each dashed curve (in blue) is a vine traced out by a persistent point on time-varying persistence diagrams. The horizontal direction denotes time
The effect of $\alpha$-smoothing. (Top) A persistence diagram, the corresponding persistence intervals (drawn vertically), and one column of a crocker plot matrix. If we had points moving in time, then we would get a time-varying persistence diagram, a time-varying persistence barcode, and a complete crocker plot matrix (swept out from left to right as time increases). (Bottom) A persistence diagram with the thick line (in red) reflecting the choice of $\alpha$-smoothing, along with the corresponding $\alpha$-smoothed persistence intervals, and one column of an $\alpha$-smoothed crocker plot matrix. The $y$-intercept of the diagonal thick red line is $2\alpha$. All persistence diagram points under the thick line are ignored under $\alpha$-smoothing
To update the heading of an agent (centered, round blue point) according to the Vicsek model, we first find the nearby neighbors within a radius $R$ (denoted by the dashed circle in red) and then take the average of its neighbors' headings, plus some noise
A plot of order parameters for three simulations of the Vicsek model with different noise parameters $\eta$. For smaller values of $\eta$, particles become more aligned, i.e. move in the same direction, over time
An example $H_0$ crocker plot of a simulation from the Viscek model with noise parameter $\eta = 0.02$. This is the same as an $\alpha$-cross section of a crocker stack when $\alpha$ = 0. In the region below the lowest curve (purple) where $\beta_0 \geq 5$, there can be many connected components, which we interpret as noise and is thus not displayed
An example $H_0$ and $H_1$ crocker stack for a simulation from the Viscek model with noise parameter $\eta = 0.02$. This figure shows the shifts of Betti curves in $H_0$ and $H_1$ as smoothing parameter $\alpha$ increases from $0$ to $0.01$ and $0.03$
The color scale corresponds to values in the distance matrix; denser color (red) means larger distances, and lighter color (yellow) means smaller distances. The 100 simulations of each noise parameter $\eta = 0.01, 0.5, 1, 1.5, 2$ are listed in order and annotated in the left matrix. The $H_{0, 1}$ crocker distance matrix is more structured (Left) than the order parameter distance matrix (Right)
The $H_{0, 1}$ crocker distance matrix is more structured (Left) than the order parameter distance matrix (Right)
The $H_{0, 1}$ crocker distance matrix is more structured (Left) than the order parameter distance matrix (Right)
The $H_{0, 1}$ crocker distance matrix is more structured (Left) than the order parameter distance matrix (Right)
Let $Z = \mathbb{R}^2$ with the Euclidean metric. The thicker solid curve (in red) represents subset $X$ of $Z$, and the thinner solid curve (in blue) represents subset $Y$ of $Z$. To compute the Hausdorff distance between $X$ and $Y$, we first take the supremum over all points in $Y$ of the distance to the closest point in $X$. In this figure, the distance is $a = \sup_{y \in Y} \inf_{x \in X} d(x, y)$. Then, we do the same for the supremum over all points in $X$ of the distance to the closest point in $Y$, as shown by $b = \sup_{x\in X} \inf_{y \in Y} d(x, y)$. Finally, we take the maximum of the two suprema, $d_H^Z(X, Y) = \max\{a, b\} = a$
Consider the persistence diagrams $\mathrm{Dgm}_k(V) = \{ (3, 6), (2, 8) \}$, denoted with red circles, and $\mathrm{Dgm}_k(W) = \{ (1, 7), (3, 7.5) \}$, denoted with blue triangles. The bottleneck distance between the two persistence diagrams is 1.5, while the erosion distance is 1
Suppose $A$ and $B$ are dynamic metric spaces. The distance between any two of the four round points (in red) in $A$ is 1 for all times $t$. The distance between any two of the four triangular points (in blue) in $B$ is $1+\varepsilon$ for all times $t$. At an arbitrary time step $t$ and scale parameter $1+\frac{\varepsilon}{2}$, we have $\beta_0^A = 1$ and $\beta_0^B = 4$
Summary of the clustering accuracy on four different experiments (abbreviated Exp.) with three different feature vectors: order parameters, crocker plots, and crocker stacks. For crocker plots and crocker stacks, we distinguish different homological dimensions: $H_{0, 1}$, $H_0$, and $H_1$. The top accuracy scores of each column are bolded. This table summarizes results with time step 1 for order parameters and time step 10 for crocker representations. Results with other time steps are discussed in Section 5.3.1. Clustering results with feature vectors that have been reduced to 3 dimensionsby PCA are shown in italics, while full feature vectors are not italicized
 Exp. 1 Exp. 2 Exp. 3 Exp. 4 Order Parameters 0.63 0.51 0.61 0.59 0.35 0.35 0.21 0.17 Crocker Plots, $H_{0, 1}$ 1.00 1.00 0.67 0.67 0.44 0.43 0.42 0.43 Crocker Plots, $H_0$ 1.00 1.00 0.67 0.77 0.45 0.43 0.39 0.43 Crocker Plots, $H_1$ 0.98 0.99 0.71 0.67 0.36 0.35 0.37 0.33 Crocker Stacks, $H_{0, 1}$ 1.00 0.98 0.67 0.67 0.47 0.38 0.41 0.35 Crocker Stacks, $H_0$ 1.00 1.00 0.67 0.67 0.49 0.46 0.41 0.41 Crocker Stacks, $H_1$ 0.96 0.98 0.63 0.67 0.34 0.37 0.32 0.35
 Exp. 1 Exp. 2 Exp. 3 Exp. 4 Order Parameters 0.63 0.51 0.61 0.59 0.35 0.35 0.21 0.17 Crocker Plots, $H_{0, 1}$ 1.00 1.00 0.67 0.67 0.44 0.43 0.42 0.43 Crocker Plots, $H_0$ 1.00 1.00 0.67 0.77 0.45 0.43 0.39 0.43 Crocker Plots, $H_1$ 0.98 0.99 0.71 0.67 0.36 0.35 0.37 0.33 Crocker Stacks, $H_{0, 1}$ 1.00 0.98 0.67 0.67 0.47 0.38 0.41 0.35 Crocker Stacks, $H_0$ 1.00 1.00 0.67 0.67 0.49 0.46 0.41 0.41 Crocker Stacks, $H_1$ 0.96 0.98 0.63 0.67 0.34 0.37 0.32 0.35
Confusion matrix using $K$-medoids to cluster the $H_{0, 1}$ crocker plots corresponding to simulations of Experiment 3 ($\eta$ = 0.01, 0.02, 0.19, 0.2, 1.99, 2). The rows represent the actual simulations corresponding to each noise parameter $\eta$ while the columns represent the parameter of the cluster medoid to which a simulation is assigned. Even though $K = 6$ clusters were formed, the six medoids correspond to simulations from only three distinct noise parameters $\eta = 0.02, 0.2, 2$
 $K$-medoids Clusters $\eta$ = 0.02 $\eta$ = 0.20 $\eta$ = 2.00 $\eta$ = 0.01 69 31 0 $\eta$ = 0.02 64 36 0 $\eta$ = 0.19 4 96 99 $\eta$ = 0.2 1 99 0 $\eta$ = 1.99 0 0 100 $\eta$ = 2.00 0 0 100
 $K$-medoids Clusters $\eta$ = 0.02 $\eta$ = 0.20 $\eta$ = 2.00 $\eta$ = 0.01 69 31 0 $\eta$ = 0.02 64 36 0 $\eta$ = 0.19 4 96 99 $\eta$ = 0.2 1 99 0 $\eta$ = 1.99 0 0 100 $\eta$ = 2.00 0 0 100
Summary of the clustering accuracy for the four experiments based on single $\alpha$-smoothed crocker plots in $H_{0, 1}$ as input feature vectors to $K$-medoids, and the accuracy based on the crocker stack (with 18 $\alpha$ values combined). The top accuracy scores of each column are bolded. Recall that when $\alpha = 0$, the $\alpha$-smoothed crocker plot is equivalent to the standard crocker plot of [55]
 Exp. 1 Exp. 2 Exp. 3 Exp. 4 stack 1.00 0.67 0.47 0.41 $\alpha$ = 0.00 1.00 0.67 0.44 0.42 $\alpha$ = 0.01 1.00 0.67 0.46 0.40 $\alpha$ = 0.03 1.00 0.67 0.49 0.40 $\alpha$ = 0.05 1.00 0.58 0.44 0.38 $\alpha$ = 0.08 0.99 0.67 0.39 0.36 $\alpha$ = 0.11 0.97 0.67 0.33 0.35 $\alpha$ = 0.13 0.92 0.73 0.41 0.28 $\alpha$ = 0.17 0.73 0.66 0.40 0.21
 Exp. 1 Exp. 2 Exp. 3 Exp. 4 stack 1.00 0.67 0.47 0.41 $\alpha$ = 0.00 1.00 0.67 0.44 0.42 $\alpha$ = 0.01 1.00 0.67 0.46 0.40 $\alpha$ = 0.03 1.00 0.67 0.49 0.40 $\alpha$ = 0.05 1.00 0.58 0.44 0.38 $\alpha$ = 0.08 0.99 0.67 0.39 0.36 $\alpha$ = 0.11 0.97 0.67 0.33 0.35 $\alpha$ = 0.13 0.92 0.73 0.41 0.28 $\alpha$ = 0.17 0.73 0.66 0.40 0.21
Comparison of the $K$-medoids clustering accuracy of Experiment 2 of the Euclidean distance on order parameters, crocker plots, and crocker stacks, as well as the bottleneck distance on the stacked set of persistence diagrams. All topological representations compute homology in dimensions 0 and 1, denoted $H_0$ and $H_1$. The top accuracy scores of each column are bolded. The parentheses on the order parameter row indicate that the same computation is performed in both columns since order parameters do not incorporate homology dimensions
 $H_0$ $H_1$ Order Parameters (0.61) (0.61) Crocker Plots 0.67 0.71 Crocker Stacks 0.67 0.63 Stacked Persistence Diagrams 0.67 0.49
 $H_0$ $H_1$ Order Parameters (0.61) (0.61) Crocker Plots 0.67 0.71 Crocker Stacks 0.67 0.63 Stacked Persistence Diagrams 0.67 0.49
 [1] Jiang Xie, Junfu Xu, Celine Nie, Qing Nie. Machine learning of swimming data via wisdom of crowd and regression analysis. Mathematical Biosciences & Engineering, 2017, 14 (2) : 511-527. doi: 10.3934/mbe.2017031 [2] Andreas Chirstmann, Qiang Wu, Ding-Xuan Zhou. Preface to the special issue on analysis in machine learning and data science. Communications on Pure and Applied Analysis, 2020, 19 (8) : i-iii. doi: 10.3934/cpaa.2020171 [3] Marc Bocquet, Julien Brajard, Alberto Carrassi, Laurent Bertino. Bayesian inference of chaotic dynamics by merging data assimilation, machine learning and expectation-maximization. Foundations of Data Science, 2020, 2 (1) : 55-80. doi: 10.3934/fods.2020004 [4] Yang Kuang, John D. Nagy, James J. Elser. Biological stoichiometry of tumor dynamics: Mathematical models and analysis. Discrete and Continuous Dynamical Systems - B, 2004, 4 (1) : 221-240. doi: 10.3934/dcdsb.2004.4.221 [5] George Siopsis. Quantum topological data analysis with continuous variables. Foundations of Data Science, 2019, 1 (4) : 419-431. doi: 10.3934/fods.2019017 [6] Tyrus Berry, Timothy Sauer. Consistent manifold representation for topological data analysis. Foundations of Data Science, 2019, 1 (1) : 1-38. doi: 10.3934/fods.2019001 [7] Erik Carlsson, John Gunnar Carlsson, Shannon Sweitzer. Applying topological data analysis to local search problems. Foundations of Data Science, 2022  doi: 10.3934/fods.2022006 [8] José Antonio Carrillo, Seung-Yeal Ha, Lorenzo Pareschi, Benedetto Piccoli. Special issue on mathematical models for collective dynamics. Networks and Heterogeneous Media, 2020, 15 (3) : i-i. doi: 10.3934/nhm.2020020 [9] Zhouchen Lin. A review on low-rank models in data analysis. Big Data & Information Analytics, 2016, 1 (2&3) : 139-161. doi: 10.3934/bdia.2016001 [10] Expeditho Mtisi, Herieth Rwezaura, Jean Michel Tchuenche. A mathematical analysis of malaria and tuberculosis co-dynamics. Discrete and Continuous Dynamical Systems - B, 2009, 12 (4) : 827-864. doi: 10.3934/dcdsb.2009.12.827 [11] Nur Aidya Hanum Aizam, Louis Caccetta. Computational models for timetabling problem. Numerical Algebra, Control and Optimization, 2014, 4 (3) : 269-285. doi: 10.3934/naco.2014.4.269 [12] Émilie Chouzenoux, Henri Gérard, Jean-Christophe Pesquet. General risk measures for robust machine learning. Foundations of Data Science, 2019, 1 (3) : 249-269. doi: 10.3934/fods.2019011 [13] Ana Rita Nogueira, João Gama, Carlos Abreu Ferreira. Causal discovery in machine learning: Theories and applications. Journal of Dynamics and Games, 2021, 8 (3) : 203-231. doi: 10.3934/jdg.2021008 [14] Hasan Hosseini-Nasab, Vahid Ettehadi. Development of opened-network data envelopment analysis models under uncertainty. Journal of Industrial and Management Optimization, 2022  doi: 10.3934/jimo.2022027 [15] Oluwaseun Sharomi, Chandra N. Podder, Abba B. Gumel, Baojun Song. Mathematical analysis of the transmission dynamics of HIV/TB coinfection in the presence of treatment. Mathematical Biosciences & Engineering, 2008, 5 (1) : 145-174. doi: 10.3934/mbe.2008.5.145 [16] Stephen Pankavich, Christian Parkinson. Mathematical analysis of an in-host model of viral dynamics with spatial heterogeneity. Discrete and Continuous Dynamical Systems - B, 2016, 21 (4) : 1237-1257. doi: 10.3934/dcdsb.2016.21.1237 [17] Abhinav Tandon. Crop - Weed interactive dynamics in the presence of herbicides: Mathematical modeling and analysis. Discrete and Continuous Dynamical Systems - B, 2022, 27 (8) : 4589-4618. doi: 10.3934/dcdsb.2021244 [18] Chris Good, Sergio Macías. What is topological about topological dynamics?. Discrete and Continuous Dynamical Systems, 2018, 38 (3) : 1007-1031. doi: 10.3934/dcds.2018043 [19] Sebastien Motsch, Mehdi Moussaïd, Elsa G. Guillot, Mathieu Moreau, Julien Pettré, Guy Theraulaz, Cécile Appert-Rolland, Pierre Degond. Modeling crowd dynamics through coarse-grained data analysis. Mathematical Biosciences & Engineering, 2018, 15 (6) : 1271-1290. doi: 10.3934/mbe.2018059 [20] Jelena Grbić, Jie Wu, Kelin Xia, Guo-Wei Wei. Aspects of topological approaches for data science. Foundations of Data Science, 2022, 4 (2) : 165-216. doi: 10.3934/fods.2022002

Impact Factor: