Statistical depths provide a fundamental generalization of quantiles and medians to data in higher dimensions. This paper proposes a new type of globally defined statistical depth, based upon control theory and eikonal equations, which measures the smallest amount of probability density that has to be passed through in a path to locations on the boundary of the support of the distribution or spatial infinity for distributions with unbounded support. This depth is easy to interpret and compute, expressively captures multi-modal behavior, and extends naturally to data that is non-Euclidean. We prove various properties of this depth, and provide discussion of computational considerations. In particular, we demonstrate that this notion of depth is robust under an approximate isometrically constrained adversarial model, a property which is not enjoyed by the Tukey depth. Finally we give some illustrative examples in the context of two-dimensional mixture models and MNIST.
Citation: |
Figure 1. The figure shows, for $ \phi(s) = s $, the level sets of the eikonal depth of a distribution whose densities have compact support (uniform density on a square in Figure 1a) and support on all of $ \mathbb{R}^d $ (a mixture of Gaussians in Figure 1b). See Example 5 for more details on mixtures of Gaussian distributions
Figure 2. For $ \phi(s) = s^{\alpha} $, the figures show the effect of $ \alpha $ on the eikonal depth of a Gaussian mixture. Figure 2a shows the probability density for the Gaussian mixture. Figures 2b, 2c, and 2d show some level sets of the eikonal depth (for $ \alpha = 0.5 $, $ \alpha = 1 $, and $ \alpha = 2 $) of the probability distribution with density as depicted in 2a
Figure 4. Figure 4a shows some level sets of the eikonal depth for a Gaussian mixture model with means that are four standard deviations apart. The contribution of any of the components is negligible at two standard deviations from the other mean. Figure 4b shows some level sets for the eikonal depth for a Gaussian mixture model with means that are two standard derivations apart
Figure 6. The figure shows some representative points belonging to different level sets of the eikonal depth on the elements in MNIST labeled as 4's. The number of representatives in each depth range indicates the proportion of 4's in each range. We can observe that points lying deep in the distribution seem to be neat examples of a hand-written '4'
[1] | E. Arias-Castro, D. Mason and B. Pelletier, On the estimation of the gradient lines of a density and the consistency of the mean-shift algorithm, Journal of Machine Learning Research, 17 (2016), 1-28. |
[2] | A. Baldominos, Y. Saez and P. Isasi, A survey of handwritten character recognition with MNIST and EMNIST, Applied Sciences, 9 (2019), 3169. doi: 10.3390/app9153169. |
[3] | M. Bardi and I. Capuzzo-Dolcetta, Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations, Springer Science & Business Media, 2008. doi: 10.1007/978-0-8176-4755-1. |
[4] | G. Barles and P. E. Souganidis, Convergence of approximation schemes for fully nonlinear second order equations, Asymptotic Analysis, 4 (1991), 271-283. doi: 10.3233/ASY-1991-4305. |
[5] | V. Barnett, The ordering of multivariate data, Journal of the Royal Statistical Society: Series A (General), 139 (1976), 318-355. doi: 10.2307/2344839. |
[6] | L. Bungert, N. García Trillos and R. Murray, The geometry of adversarial training in binary classification, Information and Inference: A Journal of the IMA, 12 (2023), 921-968. doi: 10.1093/imaiai/iaac029. |
[7] | D. Burago, Y. Burago and S. Ivanov, A Course in Metric Geometry, volume 33, American Mathematical Soc., 2001. doi: 10.1090/gsm/033. |
[8] | J. Calder, Lecture Notes on Viscosity Solutions, Lecture Notes, 2018. |
[9] | J. Calder, B. Cook, M. Thorpe and D. Slepcev, Poisson learning: Graph based semi-supervised learning at very low label rates, In International Conference on Machine Learning, PMLR, 2020, 1306-1316. |
[10] | J. Calder, S. Esedoglu and A. O. Hero, A Hamilton–Jacobi equation for the continuum limit of nondominated sorting, SIAM Journal on Mathematical Analysis, 46 (2014), 603-638. doi: 10.1137/13092842X. |
[11] | J. Calder and M. Ettehad, Hamilton-Jacobi equations on graphs with applications to semi-supervised learning and data depth, The Journal of Machine Learning Research, 23 (2022), 14267-14328. |
[12] | J. Calder, S. Park and D. Slepčev, Boundary estimation from point clouds: Algorithms, guarantees and applications, Journal of Scientific Computing, 92 (2022), Paper No. 56, 59 pp. doi: 10.1007/s10915-022-01894-9. |
[13] | J. Calder and C. K. Smart, The limit shape of convex hull peeling, Duke Mathematical Journal, 169 (2020), 2079-2124. doi: 10.1215/00127094-2020-0013. |
[14] | P. Cannarsa and C. Sinestrari, Semiconcave functions, Hamilton-Jacobi Equations, and Optimal Control, volume 58., Springer Science & Business Media, 2004. |
[15] | E. Carrizosa, A characterization of halfspace depth, Journal of Multivariate Analysis, 58 (1996), 21-26. doi: 10.1006/jmva.1996.0037. |
[16] | R. Chen and I. Ch. Paschalidis, Distributionally robust learning, Foundations and Trends® in Optimization, 4 (2020), 1-243. doi: 10.1561/2400000026. |
[17] | V. Chernozhukov, A. Galichon, M. Hallin and M. Henry, Monge–Kantorovich depth, quantiles, ranks and signs, Annals of Statistics, 45 (2017), 223-256. doi: 10.1214/16-AOS1450. |
[18] | B. Cook and J. Calder, Rates of convergence for the continuum limit of nondominated sorting, SIAM Journal on Mathematical Analysis, 54 (2022), 872-911. doi: 10.1137/20M1344901. |
[19] | M. G. Crandall and P.-L. Lions, Viscosity solutions of Hamilton-Jacobi equations, Transactions of the American Mathematical Society, 277 (1983), 1-42. doi: 10.1090/S0002-9947-1983-0690039-8. |
[20] | K. Dalal, Counting the onion, Random Structures & Algorithms, 24 (2004), 155-165. doi: 10.1002/rsa.10114. |
[21] | X. Desquesnes, A. Elmoataz and O. Lézoray, Eikonal equation adaptation on weighted graphs: Fast geometric diffusion process for local and non-local image and data processing, Journal of Mathematical Imaging and Vision, 46 (2013), 238-257. doi: 10.1007/s10851-012-0380-9. |
[22] | L. Devroye and G. Lugosi, Combinatorial Methods in Density Estimation, Springer Science & Business Media, 2001. doi: 10.1007/978-1-4613-0125-7. |
[23] | R. Dyckerhoff, K. Mosler and G. Koshevoy, Zonoid data depth: Theory and computation, In COMPSTAT, Springer, 1996, 235-240. doi: 10.1007/978-3-642-46992-3_26. |
[24] | L. C. Evans, Partial Differential Equations, volume 19., American Mathematical Society, 1998. doi: 10.1090/gsm/019. |
[25] | J. Fadili, N. Forcadel, T. T. Nguyen and R. Zantout, Limits and consistency of nonlocal and graph approximations to the Eikonal equation, IMA Journal of Numerical Analysis, 43 (2023), 3685-3728. doi: 10.1093/imanum/drac082. |
[26] | R. Fraiman, J. Meloche, L. García-Escudero, A. Gordaliza, X. He, R. Maronna, V. J. Yohai, S. J. Sheather, J. W, McKean, C. G. Small, et al., Multivariate L-estimation, Test, 8 (1999), 255-317. doi: 10.1007/BF02595872. |
[27] | N. García Trillos, M. Gerlach, M. Hein and D. Slepčev, Error estimates for spectral convergence of the graph Laplacian on random geometric graphs toward the Laplace–Beltrami operator, Foundations of Computational Mathematics, 20 (2020), 827-887. doi: 10.1007/s10208-019-09436-w. |
[28] | N. García Trillos, P. He and C. Li, Large sample spectral analysis of graph-based multi-manifold clustering, Journal of Machine Learning Research, 24 (2023), 1-71. |
[29] | N. García Trillos and R. Murray, A new analytical approach to consistency and overfitting in regularized empirical risk minimization, European Journal of Applied Mathematics, 28 (2017), 886-921. doi: 10.1017/S0956792517000201. |
[30] | N. García Trillos and R. W. Murray, A maximum principle argument for the uniform convergence of graph Laplacian regressors, SIAM Journal on Mathematics of Data Science, 2 (2020), 705-739. doi: 10.1137/19M1245372. |
[31] | N. García Trillos and R. Murray, Adversarial classification: Necessary conditions and geometric flows, Journal of Machine Learning Research, 23 (2022), 1-38. |
[32] | N. García Trillos and D. Slepčev, Continuum limit of total variation on point clouds, Archive for Rational Mechanics and Analysis, 220 (2016), 193-241. doi: 10.1007/s00205-015-0929-z. |
[33] | E. Giné and A. Guillou, Rates of strong uniform consistency for multivariate kernel density estimators, Annales de l'Institut Henri Poincare (B) Probability and Statistics, 38 (2002), 907-921. doi: 10.1016/S0246-0203(02)01128-7. |
[34] | I. Goodfellow, J. Shlens and C. Szegedy, Explaining and harnessing adversarial examples, In International Conference on Learning Representations, 2015. |
[35] | J. L. Hodges, A bivariate sign test, The Annals of Mathematical Statistics, 26 (1955), 523-527. doi: 10.1214/aoms/1177728498. |
[36] | H. Hotelling, Stability in competition, Economic Journal, 39 (1929), 41-57. doi: 10.2307/2224214. |
[37] | R. Jörnsten, Clustering and classification based on the $L^1$ data depth, Journal of Multivariate Analysis, 90 (2004), 67-89. doi: 10.1016/j.jmva.2004.02.013. |
[38] | N. Katzourakis, An Introduction to Viscosity Solutions for Fully Nonlinear PDE with Applications to Calculus of Variations in $L^\infty$, Springer, 2015. doi: 10.1007/978-3-319-12829-0. |
[39] | G. A. Koshevoy, The Tukey depth characterizes the atomic measure, Journal of Multivariate Analysis, 83 (2002), 360-364. doi: 10.1006/jmva.2001.2052. |
[40] | G. A. Koshevoy, Lift-zonoid and multivariate depths, In Developments in Robust Statistics, Springer, 2003,194-202. |
[41] | M. Lewicka and Y. Peres, Which domains have two-sided supporting unit spheres at every boundary point?, Expositiones Mathematicae, 38 (2020), 548-558. doi: 10.1016/j.exmath.2019.01.003. |
[42] | R. Y. Liu, J. M. Parelius and K. Singh, Multivariate analysis by data depth: Descriptive statistics, graphics and inference, (with discussion and a rejoinder by Liu and Singh), The Annals of Statistics, 27 (1999), 783-858. doi: 10.1214/aos/1018031259. |
[43] | W. S. Lok and S. M. S. Lee, A new statistical depth function with applications to multimodal data, Journal of Nonparametric Statistics, 23 (2011), 617-631. doi: 10.1080/10485252.2011.553953. |
[44] | A. Madry, A. Makelov, L. Schmidt, D. Tsipras and A. Vladu, Towards deep learning models resistant to adversarial attacks, 2019. |
[45] | P. C. Mahalanobis, On the generalized distance in statistics, National Institute of Science of India, 1936. |
[46] | D. M. Mason, Proving consistency of non-standard kernel estimators, Statistical Inference for Stochastic Processes, 15 (2012), 151-176. doi: 10.1007/s11203-012-9068-4. |
[47] | D. M. Mason and J. W. H. Swanepoel, A general result on the uniform in bandwidth consistency of kernel-type function estimators, Test, 20 (2011), 72-94. doi: 10.1007/s11749-010-0188-0. |
[48] | J.-C. Massé, Asymptotics for the Tukey depth process, with an application to a multivariate trimmed mean, Bernoulli, 10 (2004), 397-419. doi: 10.3150/bj/1089206404. |
[49] | M. Molina-Fructuoso and R. Murray, Tukey depths and Hamilton–Jacobi differential equations, SIAM Journal on Mathematics of Data Science, 4 (2022), 604-633. doi: 10.1137/21M1411998. |
[50] | S.-M. Moosavi-Dezfooli, A. Fawzi, J. Uesato and P. Frossard, Robustness via curvature regularization, and vice versa, In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, 9078-9086. doi: 10.1109/CVPR.2019.00929. |
[51] | K. Mosler and P. Mozharovskyi, Choosing among notions of multivariate depth statistics, Statistical Science, 37 (2022), 348-368. doi: 10.1214/21-sts827. |
[52] | S. Nagy, The halfspace depth characterization problem, In Conference of the International Society for Non-Parametric Statistics, Springer, (2018), 379-389. doi: 10.1007/978-3-030-57306-5_34. |
[53] | S. Nagy, Halfspace depth does not characterize probability distributions, Statistical Papers, 62 (2021), 1135-1139. doi: 10.1007/s00362-019-01130-x. |
[54] | S. Nagy, C. Schuett and E. M. Werner, Halfspace depth and floating body, Statistics Surveys, 13 (2019), 52-118. doi: 10.1214/19-ss123. |
[55] | H. Oja, Descriptive statistics for multivariate distributions, Statistics & Probability Letters, 1 (1983), 327-332. doi: 10.1016/0167-7152(83)90054-8. |
[56] | M. S. Pydi and V. Jog, Adversarial risk via optimal transport and optimal couplings, In International Conference on Machine Learning, PMLR, 2020, 7814-7823. |
[57] | P. J. Rousseeuw, I. Ruts and W. Tukey, The bagplot: A bivariate boxplot, The American Statistician, 53 (1999), 382-387. doi: 10.1080/00031305.1999.10474494. |
[58] | R. Serfling, Equivariance and invariance properties of multivariate quantile and related functions, and the role of standardisation, Journal of Nonparametric Statistics, 22 (2010), 915-936. doi: 10.1080/10485250903431710. |
[59] | J. A. Sethian, Level Set Methods and Fast Marching Methods: Evolving Interfaces in Computational Geometry, Fluid Mechanics, Computer Vision, and Materials Science, volume 3. Cambridge university press, 1999. |
[60] | J. A. Sethian and A. Vladimirsky, Fast methods for the eikonal and related Hamilton–Jacobi equations on unstructured meshes, Proceedings of the National Academy of Sciences, 97 (2000), 5699-5703. doi: 10.1073/pnas.090060097. |
[61] | B. W. Silverman, Density Estimation for Statistics and Data Analysis, volume 26., CRC press, 1986. |
[62] | D. Slepčev and M. Thorpe, Analysis of p-Laplacian regularization in semisupervised learning, SIAM Journal on Mathematical Analysis, 51 (2019), 2085-2120. doi: 10.1137/17M115222X. |
[63] | C. Small, Multidimensional medians arising from geodesics on graphs, The Annals of Statistics, 25 (1997), 478-494. doi: 10.1214/aos/1031833660. |
[64] | A. Struyf and P. J. Rousseeuw, Halfspace depth and regression depth characterize the empirical distribution, Journal of Multivariate Analysis, 69 (1999), 135-153. doi: 10.1006/jmva.1998.1804. |
[65] | J. Tukey, Mathematics and picturing of data, Proceedings of ICM, 1975,523-531. |
[66] | R. Vaughn, T. Berry and H. Antil, Diffusion maps for embedded manifolds with boundary with applications to PDEs, Applied and Computational Harmonic Analysis, 68 (2024), 101593. doi: 10.1016/j.acha.2023.101593. |
[67] | J. Wang and R. Serfling, Nonparametric multivariate kurtosis and tailweight measures, J. Nonparametric Statistics, 17 (2005), 441-456. doi: 10.1080/10485250500039130. |
[68] | Y. Zuo, Projection-based depth functions and associated medians, The Annals of Statistics, 31 (2003), 1460-1490. doi: 10.1214/aos/1065705115. |
[69] | Y. Zuo, H. Cui and X. He, On the Stahel-Donoho estimator and depth-weighted means of multivariate data, The Annals of Statistics, 32 (2004), 167-188. doi: 10.1214/aos/1079120132. |
[70] | Y. Zuo and R. Serfling, General notions of statistical depth function, Annals of Statistics, 28 (2000), 461-482. doi: 10.1214/aos/1016218226. |
The figure shows, for
For
The figure shows some level sets of the eikonal depth of a Gaussian mixture with
Figure 4a shows some level sets of the eikonal depth for a Gaussian mixture model with means that are four standard deviations apart. The contribution of any of the components is negligible at two standard deviations from the other mean. Figure 4b shows some level sets for the eikonal depth for a Gaussian mixture model with means that are two standard derivations apart
The figure shows examples of the eikonal depth of some densities on manifolds. Figure 5c shows only the points in 5b with
The figure shows some representative points belonging to different level sets of the eikonal depth on the elements in MNIST labeled as 4's. The number of representatives in each depth range indicates the proportion of 4's in each range. We can observe that points lying deep in the distribution seem to be neat examples of a hand-written '4'
Figure 7a shows some representative points belonging to different level sets of the eikonal depth on the elements in MNIST labeled as 4's when the boundary of this set is defined using an approximation to the density. Figure 7b shows some of the elements on this boundary