doi: 10.3934/fods.2021024
Online First


Generalized penalty for circular coordinate representation

Hengrui Luo, Alice Patania, Jisu Kim and Mikael Vejdemo-Johansson

1. Lawrence Berkeley National Laboratory, 1 Cyclotron Rd, Berkeley, CA 94720, USA
2. Department of Statistics, College of Arts and Sciences, Ohio State University, 1958 Neil Ave, Columbus, OH 43210, USA
3. Indiana University Network Science Institute (IUNI), Indiana University, 1001 IN-45/46 E SR Bypass, Bloomington, IN 47408, USA
4. DataShape team, Inria Saclay, rue Michel Magat, Faculté des Sciences d'Orsay, Université Paris-Saclay, Orsay, Île-de-France 91400, France
5. LMO, Université Paris-Saclay, Bâtiment 307, rue Michel Magat, Faculté des Sciences d'Orsay, Orsay, Île-de-France 91400, France
6. Department of Mathematics, CUNY College of Staten Island, and Computer Science Programme at CUNY Graduate Center, 2800 Victory Boulevard, 1S-215, Staten Island, NY 10314, USA

* Corresponding author: Hengrui Luo

Received March 2021; revised June 2021; early access September 2021

Topological Data Analysis (TDA) provides novel approaches for analyzing the geometric shapes and topological structures of a dataset. One important application of TDA is data visualization and dimension reduction. We follow the framework of circular coordinate representation, which uses persistent cohomology to perform dimension reduction and visualization of high-dimensional datasets on a torus. In this paper, we propose a method that adapts the circular coordinate framework to account for the roughness of circular coordinates in change-point and high-dimensional applications. To do so, we replace the $ L_{2} $ penalty in the traditional circular coordinate algorithm with a generalized penalty function. We provide simulation experiments and real data analyses supporting our claim that circular coordinates with a generalized penalty detect changes in high-dimensional datasets under different sampling schemes while preserving topological structures.
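The smoothing step underlying both the classical and the generalized penalty can be illustrated on a toy example. The sketch below (not the authors' implementation; the mixing parameter `lam` and all function names are hypothetical stand-ins) works on a cycle graph: a cocycle `alpha` assigns a value to each edge, and smoothing picks vertex offsets `z` minimizing a penalty on the residual `alpha - dz`, interpolating between the traditional $ L_2 $ penalty (`lam = 0`) and an $ L_1 $ penalty (`lam = 1`).

```python
import numpy as np

# Toy sketch of circular-coordinate smoothing on a cycle graph with n
# vertices.  The cocycle alpha assigns a value to each edge (i, i+1 mod n);
# its values sum to ~1 around the loop (winding number 1).  Smoothing
# chooses vertex offsets z minimizing a penalty on r = alpha - dz, where
# (dz)_e = z_{i+1} - z_i.  lam is a hypothetical mixing parameter between
# the L2 penalty (lam = 0) and the L1 penalty (lam = 1).
n = 20
rng = np.random.default_rng(0)
alpha = np.full(n, 1.0 / n) + 0.01 * rng.standard_normal(n)

def coboundary(z):
    """(dz) on the edges (i, i+1 mod n) of the cycle graph."""
    return np.roll(z, -1) - z

def smooth(alpha, lam=0.0, lr=0.05, steps=2000):
    """(Sub)gradient descent on sum_e p(r_e), r = alpha - dz,
    with p(r) = (1 - lam) * r**2 / 2 + lam * |r|."""
    z = np.zeros(len(alpha))
    for _ in range(steps):
        r = alpha - coboundary(z)
        g = (1.0 - lam) * r + lam * np.sign(r)   # p'(r_e), elementwise
        z -= lr * (g - np.roll(g, 1))            # chain rule through dz
        z -= z.mean()                            # fix the translation gauge
    return z

z_l2 = smooth(alpha, lam=0.0)   # classical L2 smoothing
z_l1 = smooth(alpha, lam=1.0)   # pure L1 (generalized) smoothing
# The smoothed circular coordinate integrates abar = alpha - dz, mod 1.
abar = alpha - coboundary(z_l2)
theta = np.concatenate([[0.0], np.cumsum(abar[:-1])]) % 1.0
```

For the $ L_2 $ case the optimum spreads the residual evenly around the loop (each edge gets the mean of `alpha`), which is the smoothness property the classical algorithm exploits; a sparsity-inducing penalty instead concentrates the residual on few edges.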

Citation: Hengrui Luo, Alice Patania, Jisu Kim, Mikael Vejdemo-Johansson. Generalized penalty for circular coordinate representation. Foundations of Data Science, doi: 10.3934/fods.2021024
References:
[1] H. Adams, A. Tausz and M. Vejdemo-Johansson, JavaPlex: A research software package for persistent (co)homology, in International Congress on Mathematical Software, Lecture Notes in Computer Science, 8592, Springer, Berlin, Heidelberg, (2014), 129–136. doi: 10.1007/978-3-662-44199-2_23.
[2] C. C. Aggarwal, A. Hinneburg and D. A. Keim, On the surprising behavior of distance metrics in high dimensional space, in International Conference on Database Theory, Lecture Notes in Computer Science, 1973, Springer, Berlin, Heidelberg, (2001), 420–434. doi: 10.1007/3-540-44503-X_27.
[3] J. Alman and V. Vassilevska Williams, A refined laser method and faster matrix multiplication, in Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), SIAM, Philadelphia, PA, (2021), 522–539. doi: 10.1137/1.9781611976465.32.
[4] M. Basseville and I. V. Nikiforov, Detection of Abrupt Changes: Theory and Application, Prentice Hall Information and System Sciences Series, Prentice Hall, Inc., Englewood Cliffs, NJ, 1993.
[5] U. Bauer, Ripser: Efficient computation of Vietoris-Rips persistence barcodes, J. Appl. Comput. Topol., 5 (2021), 391–423. doi: 10.1007/s41468-021-00071-5.
[6] M. Belkin and P. Niyogi, Laplacian eigenmaps for dimensionality reduction and data representation, Neural Computation, 15 (2003), 1373–1396. doi: 10.1162/089976603321780317.
[7] E. Berberich and M. Kerber, Exact arrangements on tori and Dupin cyclides, in Proceedings of the 2008 ACM Symposium on Solid and Physical Modeling, ACM, (2008), 59–66. doi: 10.1145/1364901.1364912.
[8] T. Caliński and J. Harabasz, A dendrite method for cluster analysis, Comm. Statist., 3 (1974), 1–27. doi: 10.1080/03610927408827101.
[9] E. J. Candès, Mathematics of sparsity (and a few other things), in Proceedings of the International Congress of Mathematicians—Seoul 2014, Vol. 1, Kyung Moon Sa, Seoul, 2014, 235–258.
[10] G. Carlsson, Topology and data, Bull. Amer. Math. Soc. (N.S.), 46 (2009), 255–308. doi: 10.1090/S0273-0979-09-01249-X.
[11] D. Coppersmith and S. Winograd, Matrix multiplication via arithmetic progressions, J. Symbolic Comput., 9 (1990), 251–280. doi: 10.1016/S0747-7171(08)80013-2.
[12] D. L. Davies and D. W. Bouldin, A cluster separation measure, IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-1 (1979), 224–227. doi: 10.1109/TPAMI.1979.4766909.
[13] R. C. de Amorim and B. Mirkin, Minkowski metric, feature weighting and anomalous cluster initializing in K-Means clustering, Pattern Recognition, 45 (2012), 1061–1075. doi: 10.1016/j.patcog.2011.08.012.
[14] V. de Silva, D. Morozov and M. Vejdemo-Johansson, Persistent cohomology and circular coordinates, Discrete Comput. Geom., 45 (2011), 737–759. doi: 10.1007/s00454-011-9344-x.
[15] D. L. Donoho and C. Grimes, Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data, Proceedings of the National Academy of Sciences, 100 (2003), 5591–5596. doi: 10.1073/pnas.1031596100.
[16] M. Elad, Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing, Springer, New York, 2010. doi: 10.1007/978-1-4419-7011-4.
[17] B. T. Fasy, F. Lecci, A. Rinaldo, L. Wasserman, S. Balakrishnan and A. Singh, Confidence sets for persistence diagrams, Ann. Statist., 42 (2014), 2301–2339. doi: 10.1214/14-AOS1252.
[18] A. Gupta and R. Bowden, Evaluating dimensionality reduction techniques for visual category recognition using Rényi entropy, in 19th European Signal Processing Conference, IEEE, Barcelona, Spain, 2011.
[19] T. Hastie, R. Tibshirani and M. Wainwright, Statistical Learning with Sparsity: The Lasso and Generalizations, Monographs on Statistics and Applied Probability, 143, CRC Press, Boca Raton, FL, 2015.
[20] A. Hatcher, Algebraic Topology, Cambridge University Press, Cambridge, 2002.
[21] S. Holmes, Personal communication, 2020.
[22] D. P. Kingma and J. L. Ba, Adam: A method for stochastic optimization, preprint, arXiv:1412.6980.
[23] G. Kraemer, M. Reichstein and M. D. Mahecha, dimRed and coRanking—Unifying dimensionality reduction in R, The R Journal, 10 (2018), 342–358. doi: 10.32614/RJ-2018-039.
[24] J. H. Krijthe, Rtsne: T-Distributed Stochastic Neighbor Embedding Using Barnes-Hut Implementation, 2015. Available from: https://github.com/jkrijthe/Rtsne.
[25] J. A. Lee and M. Verleysen, Quality assessment of dimensionality reduction: Rank-based criteria, Neurocomputing, 72 (2009), 1431–1443. doi: 10.1016/j.neucom.2008.12.017.
[26] W. Lueks, B. Mokbel, M. Biehl and B. Hammer, How to evaluate dimensionality reduction?—Improving the co-ranking matrix, preprint, arXiv:1110.3917.
[27] H. Luo and D. Li, Spherical rotation dimension reduction with geometric loss functions, work in progress.
[28] H. Luo, S. N. MacEachern and M. Peruggia, Asymptotics of lower dimensional zero-density regions, preprint, arXiv:2006.02568.
[29] L. McInnes, J. Healy and J. Melville, UMAP: Uniform manifold approximation and projection for dimension reduction, preprint, arXiv:1802.03426.
[30] B. Michel, A Statistical Approach to Topological Data Analysis, Ph.D thesis, UPMC Université Paris VI, 2015.
[31] N. Milosavljević, D. Morozov and P. Škraba, Zigzag persistent homology in matrix multiplication time, in Computational Geometry (SCG'11), ACM, New York, (2011), 216–225. doi: 10.1145/1998196.1998229.
[32] D. Morozov, Dionysus2: A library for computing persistent homology. Available from: https://github.com/mrzv/dionysus.
[33] P. Niyogi, S. Smale and S. Weinberger, Finding the homology of submanifolds with high confidence from random samples, Discrete Comput. Geom., 39 (2008), 419–441. doi: 10.1007/s00454-008-9053-2.
[34] N. Otter, M. A. Porter, U. Tillmann, P. Grindrod and H. A. Harrington, A roadmap for the computation of persistent homology, EPJ Data Science, 6 (2017), 1–38. doi: 10.1140/epjds/s13688-017-0109-5.
[35] E. S. Page, Continuous inspection schemes, Biometrika, 41 (1954), 100–115. doi: 10.1093/biomet/41.1-2.100.
[36] L. Polanco and J. A. Perea, Coordinatizing data with lens spaces and persistent cohomology, preprint, arXiv:1905.00350.
[37] M. Robinson, Multipath-dominant, pulsed doppler analysis of rotating blades, preprint, arXiv:1204.4366.
[38] M. Robinson, Personal communication, 2019.
[39] E. Ronchetti, The main contributions of robust statistics to statistical science and a new challenge, METRON, 79 (2021), 127–135. doi: 10.1007/s40300-020-00185-3.
[40] A. Tausz and G. Carlsson, Applications of zigzag persistence to topological data analysis, preprint, arXiv:1108.3545.
[41] L. van der Maaten and G. Hinton, Visualizing data using t-SNE, J. Mach. Learning Res., 9 (2008), 2579–2605. Available from: http://jmlr.org/papers/v9/vandermaaten08a.html.
[42] M. Vejdemo-Johansson, G. Carlsson, P. Y. Lum, A. Lehman, G. Singh and T. Ishkhanov, The topology of politics: Voting connectivity in the US House of Representatives, in NIPS 2012 Workshop on Algebraic Topology and Machine Learning, 2012.
[43] L. Vendramin, R. J. G. B. Campello and E. R. Hruschka, Relative clustering validity criteria: A comparative overview, Stat. Anal. Data Min., 3 (2010), 209–235. doi: 10.1002/sam.10080.
[44] R. Vidal, Y. Ma and S. S. Sastry, Generalized principal component analysis (GPCA), IEEE Transactions on Pattern Analysis and Machine Intelligence, 27 (2005), 1945–1959. doi: 10.1109/TPAMI.2005.244.
[45] B. Wang, B. Summa, V. Pascucci and M. Vejdemo-Johansson, Branching and circular features in high dimensional data, IEEE Transactions on Visualization and Computer Graphics, 17 (2011), 1902–1911. doi: 10.1109/TVCG.2011.177.
[46] E. Zemel, An O(n) algorithm for the linear multiple choice knapsack problem and related problems, Inform. Process. Lett., 18 (1984), 123–128. doi: 10.1016/0020-0190(84)90014-0.
[47] X. Zhu, Persistent homology: An introduction and a new text representation for natural language processing, in Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence, 2013, 1953–1959. Available from: https://www.ijcai.org/Proceedings/13/Papers/288.pdf.


Figure 1.1.  Example in Section 2.3 of [47] with four points $ a = (-1,0.5), b = (1,0.5), c = (1,-0.5), d = (-1,-0.5) $
Figure 1.2.  The scatter plot, barcode, coordinate plot, and the colormap for the dataset $ X\subset\mathbb{R}^{2} $, which is a dataset of $ 50 $ points equidistantly sampled on a figure-$ 8 $ shape
Figure 2.1.  The dataset $ X\subset\mathbb{R}^{3} $, which is a dataset of $ 150 $ samples on a figure-$ 8 $ shape $ S^{1}\times\{0\}\bigcup\{0\}\times(S^{1}(-1,-1)) $, where $ S^{1}(-1,-1) $ denotes a unit circle centered at $ (-1,-1) $
Figure 2.2.  The dimension reduced data $ X^{cc} $ obtained from circular coordinates based on the Vietoris-Rips complex constructed from $ X $
Figure 2.3.  The PCA representation $ X^{pca} $ from choosing $ 2 $ principal components
Figure 3.1.  Example 1: The $ L_{2} $ smoothed and generalized penalized circular coordinates of the uniformly sampled dataset ($ n = 300 $) from a ring of inner radius $ R = 1.5 $ and width $ d = 1.5 $. The first, second, and the third row correspond to $ \lambda = 0,0.5 $, and 1, respectively
Figure 3.2.  Example 1: The $ L_{1} $ smoothed (first column) and $ L_{2} $ smoothed (second column) circular coordinates of the uniformly sampled dataset from a ring with the same radius $ R = 1.5 $ but different widths $ d = 1,2,7.5 $, corresponding to each row. The first and second columns correspond to $ \lambda = 0 $ and 1, respectively
Figure 3.3.  Functional norms of varying $ \lambda $ coefficient on Example 1: The $ L_1 $ (first row), $ L_2 $ (second row), and mixed norm (third row) for smoothed circular coordinates functions optimized with different choices of $ \lambda $ as in (2.3). The coordinates are computed for the uniformly sampled dataset from a ring with the same radius $ R = 1.5 $ but different widths $ d = 1,2,7.5 $, as in Figure 3.2, corresponding to each column. We also use black vertical dashed lines to delineate the $ \lambda = 0,0.5,1 $ on the log scale
Figure 3.4.  Example 2: The $ L_{2} $ smoothed and generalized penalized circular coordinate (displayed in different rows) of the uniformly sampled dataset ($ n = 100 $) from double rings, both with inner radius $ R = 1.5 $ and width $ d = 0.5 $. The first, second, and the third row correspond to $ \lambda = 0,0.5 $, and 1, respectively
Figure 3.5.  Example 3: The $ L_{2} $ smoothed and generalized penalized circular coordinate (displayed in different rows) of the uniformly sampled dataset ($ n = 300 $) from Dupin cyclides (a.k.a. pinched torus). The first, second, and the third row correspond to $ \lambda = 0,0.5 $, and 1, respectively
Figure 4.1.  The $ S^{1} $ representation obtained from the circular coordinate representation under different penalty functions. The first, second, and the third row correspond to $ \lambda = 0 $, $ 0.5 $, and $ 1 $, respectively
Figure 4.2.  The $ L_{2} $ smoothed and generalized penalized circular coordinates (displayed in different rows) of the three collections of fan frequency dataset ($ n = 175 $) from [37] plotted against indices (equivalent to the distances of distance-bins). The first, second, and the third row correspond to $ \lambda = 0 $, $ 0.5 $, and 1, respectively. The circular coordinates with the generalized penalty function are much sparser than those with the $ L_2 $ penalty function, which means that our method captures the periodic pattern better
Figure 4.3.  The $ L_{2} $ smoothed and generalized penalized (mod 1) combined circular coordinates among congressman/woman across party-lines. Each point represents a congressman/woman and the color represents party-lines. The circular coordinates are computed from congress voting datasets from years 1990, 1998, and 2006 (displayed in different rows). The first and the second column correspond to $ \lambda = 0 $ and $ 1 $, respectively. We compute the cluster scores by mapping the combined circular coordinates (summed up by all 1-cocycles with persistence greater than 1) to $ \mathbb{R}^2 $ with the mapping $ x\mapsto(\cos(2\pi x),\sin(2\pi x)) $ to accommodate the circularity.
Figure B.1.  The GPCA representation $ X^{gpca,2} $ and $ X^{gpca,3} $ of the embeddings from the first $ 2 $ principal components of the homogeneous polynomials of degree $ 2 $ and $ 3 $, respectively
Figure C.1.  Evaluation of dimension reduction results obtained from different NLDR methods with the congress voting dataset of year 1990. We display the coranking matrices of PCA and t-SNE in the first row, and the coranking matrices of UMAP and Laplacian eigenmap in the second row. We display the coranking matrices of circular coordinates with penalty functions $ L_{1} $, elastic norm, and $ L_{2} $ in the third row
Figure C.2.  Evaluation of dimension reduction results obtained from different NLDR methods with the congress voting dataset of year 1998. We display the coranking matrices of PCA and t-SNE in the first row, and the coranking matrices of UMAP and Laplacian eigenmap in the second row. We display the coranking matrices of circular coordinates with penalty functions $ L_{1} $, elastic norm, and $ L_{2} $ in the third row
Figure C.3.  Evaluation of dimension reduction results obtained from different NLDR methods with the congress voting dataset of year 2006. We display the coranking matrices of PCA and t-SNE in the first row, and the coranking matrices of UMAP and Laplacian eigenmap in the second row. We display the coranking matrices of circular coordinates with penalty functions $ L_{1} $, elastic norm and $ L_{2} $ in the third row
Figure D.1.  Example 5: The $ L_{2} $ smoothed and generalized penalized circular coordinates of the Jacobian rejection sampled dataset ($ n = 300 $) from a ring with fixed width (Jacobian rejection sampling). The first, second, and the third row correspond to $ \lambda = 0 $, $ 0.5 $, and $ 1 $, respectively
Figure D.2.  Example 6: The $ L_{2} $ smoothed and generalized penalized circular coordinates of the Jacobian rejection sampled dataset ($ n = 300 $) from a Dupin cyclide with $ r = 2 $, $ R = 1.5 $ as in Section 3.4. The first, second, and the third row correspond to $ \lambda = 0 $, $ 0.5 $, and 1, respectively
Figure E.1.  (top) Barcode for a simulated example of 150 uniformly sampled points from an annulus. (bottom) Resulting circular coordinates computed using different thresholds along the filtration for longest persisting cocycle represented as color of the points
Figure E.2.  (top) Barcode for a simulated example of 150 uniformly sampled points from an annulus. (center) Resulting circular coordinates computed using different thresholds along the filtration for longest persisting cocycle represented as color of the points. (bottom) Resulting circular coordinates plotted against the angle theta between the respective point and the $ x = 0 $ axis with values colored the same way as the center row
Figure E.3.  Comparison of 100 circular coordinates computed with threshold varying between the birth and death of the longest cocycle in the example of Fig. E.2. The blue bars represent a box plot of the circular coordinate values relative to the points represented by angle theta
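The cluster scores described in the caption of Figure 4.3 rely on mapping circular coordinates to the plane before applying Euclidean cluster-validity criteria. A minimal sketch of that embedding (`circle_embed` is a hypothetical helper name, not from the paper's code):

```python
import numpy as np

# Sketch of the embedding used for the cluster scores of Figure 4.3:
# a circular coordinate x in [0, 1) is mapped to the unit circle via
# x |-> (cos(2 pi x), sin(2 pi x)), so that Euclidean cluster-validity
# criteria respect circularity (0.01 and 0.99 end up close together).
def circle_embed(x):
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.cos(2 * np.pi * x),
                            np.sin(2 * np.pi * x)])

pts = circle_embed([0.01, 0.99, 0.5])
d_near = np.linalg.norm(pts[0] - pts[1])   # wraps around the circle: small
d_far = np.linalg.norm(pts[0] - pts[2])    # opposite side: large
```

The embedded points can then be fed to standard criteria such as the Calinski-Harabasz or Davies-Bouldin index.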
