September 2019, 1(3): 329-387. doi: 10.3934/fods.2019015

Randomized learning of the second-moment matrix of a smooth function

1. Institute of Electrical Engineering, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland

2. Department of Electrical Engineering, Colorado School of Mines, Golden, CO 80401, USA

3. Departments of Statistics and Computer Science, Rutgers University, Piscataway, NJ 08854, USA

4. Department of Computer Science, University of Colorado Boulder, Boulder, CO 80309, USA

* Corresponding author: Armin Eftekhari

Published September 2019

Fund Project: AE was partially supported by the Alan Turing Institute under the EPSRC grant EP/N510129/1 and also by the Turing Seed Funding grant SF019. MBW was partially supported by NSF grant CCF-1409258 and NSF CAREER grant CCF-1149225. PGC was partially supported by the U.S. Department of Energy Office of Science, Office of Advanced Scientific Computing Research, Applied Mathematics program under award DE-SC0011077 and by the Defense Advanced Research Projects Agency's Enabling Quantification of Uncertainty in Physical Systems program.

Consider an open set $ \mathbb{D}\subseteq\mathbb{R}^n $, equipped with a probability measure $ \mu $. An important characteristic of a smooth function $ f:\mathbb{D}\rightarrow\mathbb{R} $ is its second-moment matrix $ \Sigma_{\mu}: = \int \nabla f(x) \nabla f(x)^* \mu(dx) \in\mathbb{R}^{n\times n} $, where $ \nabla f(x)\in\mathbb{R}^n $ is the gradient of $ f(\cdot) $ at $ x\in\mathbb{D} $ and $ * $ stands for transpose. For instance, the span of the leading $ r $ eigenvectors of $ \Sigma_{\mu} $ forms an active subspace of $ f(\cdot) $, which contains the directions along which $ f(\cdot) $ changes the most and is of particular interest in ridge approximation. In this work, we propose a simple algorithm for estimating $ \Sigma_{\mu} $ from random point evaluations of $ f(\cdot) $ without imposing any structural assumptions on $ \Sigma_{\mu} $. Theoretical guarantees for this algorithm are established with the aid of the same technical tools that have proved valuable in the context of covariance matrix estimation from partial measurements.
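To make the quantity concrete, the sketch below is an illustration only, not the algorithm analyzed in the paper: it approximates $ \Sigma_{\mu} $ by Monte Carlo from point evaluations of $ f(\cdot) $ alone, replacing each gradient by a forward finite difference, and then reads off an active subspace from the leading eigenvectors. The step size h, the Gaussian choice of $ \mu $, and the ridge-like test function are assumptions made for this example.

```python
# Minimal sketch (not the paper's algorithm): Monte Carlo estimate of the
# second-moment matrix Sigma_mu = E[grad f(x) grad f(x)^T] using only point
# evaluations of f, with forward finite differences standing in for gradients.
import numpy as np

def estimate_second_moment(f, n, num_samples, h=1e-5, seed=None):
    """Estimate Sigma_mu for x ~ N(0, I) (an assumed choice of mu) from queries of f."""
    rng = np.random.default_rng(seed)
    sigma_hat = np.zeros((n, n))
    eye = np.eye(n)
    for _ in range(num_samples):
        x = rng.standard_normal(n)      # draw x from the (assumed Gaussian) measure mu
        fx = f(x)
        # forward-difference surrogate for grad f(x): n extra point evaluations
        g = np.array([(f(x + h * eye[j]) - fx) / h for j in range(n)])
        sigma_hat += np.outer(g, g)
    return sigma_hat / num_samples

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, r = 20, 2
    A = rng.standard_normal((n, r))
    # ridge-like test function: f varies only along the two directions in A
    f = lambda x: np.sin(A[:, 0] @ x) + 0.5 * (A[:, 1] @ x) ** 2
    sigma_hat = estimate_second_moment(f, n, num_samples=2000, seed=1)
    # the span of the leading r eigenvectors estimates an active subspace of f
    eigvals, eigvecs = np.linalg.eigh(sigma_hat)
    active_subspace = eigvecs[:, -r:]
    print("leading eigenvalues:", eigvals[-r:])
```

In this toy example the estimated active subspace should approximately coincide with the column span of A, since $ f(\cdot) $ only varies along those two directions.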

Citation: Armin Eftekhari, Michael B. Wakin, Ping Li, Paul G. Constantine. Randomized learning of the second-moment matrix of a smooth function. Foundations of Data Science, 2019, 1 (3) : 329-387. doi: 10.3934/fods.2019015
References:

[1] R. Adamczak, Logarithmic Sobolev inequalities and concentration of measure for convex functions and polynomial chaoses, Bull. Pol. Acad. Sci. Math., 53 (2005), 221-238. doi: 10.4064/ba53-2-10.
[2] F. P. Anaraki and S. Hughes, Memory and computation efficient PCA via very sparse random projections, in Proceedings of the International Conference on Machine Learning (ICML-14), 2014, 1341-1349.
[3] M. Azizyan, A. Krishnamurthy and A. Singh, Extreme compressive sampling for covariance estimation, IEEE Trans. Inform. Theory, 64 (2018), 7613-7635, arXiv: 1506.00898. doi: 10.1109/TIT.2018.2871077.
[4] D. S. Bernstein, Matrix Mathematics: Theory, Facts, and Formulas (Second Edition), Princeton Reference, Princeton University Press, 2009, https://books.google.co.uk/books?id=x7isojLkDTcC. doi: 10.1515/9781400833344.
[5] I. Bogunovic, V. Cevher, J. Haupt and J. Scarlett, Active learning of self-concordant like multi-index functions, in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 2015, 2189-2193. doi: 10.1109/ICASSP.2015.7178359.
[6] T. T. Cai and A. Zhang, ROP: Matrix recovery via rank-one projections, The Annals of Statistics, 43 (2015), 102-138. doi: 10.1214/14-AOS1267.
[7] E. J. Candes, Ridgelets: Theory and Applications, PhD thesis, Stanford University, 1998.
[8] Y. Chen, Y. Chi and A. J. Goldsmith, Exact and stable covariance estimation from quadratic sampling via convex programming, IEEE Transactions on Information Theory, 61 (2015), 4034-4059. doi: 10.1109/TIT.2015.2429594.
[9] A. Cohen, I. Daubechies, R. DeVore, G. Kerkyacharian and D. Picard, Capturing ridge functions in high dimensions from point queries, Constructive Approximation, 35 (2012), 225-243. doi: 10.1007/s00365-011-9147-6.
[10] P. Constantine and D. Gleich, Computing active subspaces with Monte Carlo, arXiv preprint, arXiv: 1408.0545.
[11] P. G. Constantine, Active Subspaces: Emerging Ideas for Dimension Reduction in Parameter Studies, SIAM, Philadelphia, 2015, https://books.google.com/books?id=TOJ9BwAAQBAJ.
[12] P. G. Constantine, A. Eftekhari and R. Ward, A near-stationary subspace for ridge approximation, Computer Methods in Applied Mechanics and Engineering, 326 (2017), 402-421, arXiv: 1606.01929. doi: 10.1016/j.cma.2017.07.038.
[13] R. D. Cook, Using dimension-reduction subspaces to identify important inputs in models of physical systems, in Proceedings of the Section on Physical and Engineering Sciences, American Statistical Association, Alexandria, VA, 1994, 18-25.
[14] G. Dasarathy, P. Shah, B. N. Bhaskar and R. D. Nowak, Sketching sparse matrices, covariances, and graphs via tensor products, IEEE Transactions on Information Theory, 61 (2015), 1373-1388. doi: 10.1109/TIT.2015.2391251.
[15] R. DeVore, G. Petrova and P. Wojtaszczyk, Approximation of functions of few variables in high dimensions, Constructive Approximation, 33 (2011), 125-143. doi: 10.1007/s00365-010-9105-8.
[16] D. L. Donoho and I. M. Johnstone, Projection-based approximation and a duality with kernel methods, The Annals of Statistics, 17 (1989), 58-106. doi: 10.1214/aos/1176347004.
[17] A. Eftekhari, L. Balzano and M. B. Wakin, What to expect when you are expecting on the Grassmannian, IEEE Signal Processing Letters, 24 (2017), 872-876, arXiv: 1611.07216. doi: 10.1109/LSP.2017.2684784.
[18] A. Eftekhari, M. B. Wakin and R. A. Ward, MC$^2$: A two-phase algorithm for leveraged matrix completion, Inf. Inference, 7 (2018), 581-604, arXiv: 1609.01795. doi: 10.1093/imaiai/iax020.
[19] M. Fornasier, K. Schnass and J. Vybiral, Learning functions of few arbitrary linear parameters in high dimensions, Foundations of Computational Mathematics, 12 (2012), 229-262. doi: 10.1007/s10208-012-9115-y.
[20] J. Friedman, T. Hastie and R. Tibshirani, The Elements of Statistical Learning, vol. 1, Springer Series in Statistics, Springer, Berlin, 2001. doi: 10.1007/978-0-387-21606-5.
[21] J. Friedman, T. Hastie and R. Tibshirani, Sparse inverse covariance estimation with the graphical lasso, Biostatistics, 9 (2008), 432-441. doi: 10.1093/biostatistics/kxm045.
[22] J. H. Friedman and W. Stuetzle, Projection pursuit regression, Journal of the American Statistical Association, 76 (1981), 817-823. doi: 10.1080/01621459.1981.10477729.
[23] K. Fukumizu, F. R. Bach and M. I. Jordan, Dimensionality reduction for supervised learning with reproducing kernel Hilbert spaces, The Journal of Machine Learning Research, 5 (2004), 73-99.
[24] S. Gaiffas and G. Lecue, Optimal rates and adaptation in the single-index model using aggregation, Electronic Journal of Statistics, 1 (2007), 538-573. doi: 10.1214/07-EJS077.
[25] A. T. Glaws, P. G. Constantine and R. D. Cook, Inverse regression for ridge recovery, arXiv preprint, arXiv: 1702.02227.
[26] G. K. Golubev, Asymptotic minimax estimation of regression in the additive model, Problemy Peredachi Informatsii, 28 (1992), 3-15.
[27] A. Gonen, D. Rosenbaum, Y. Eldar and S. Shalev-Shwartz, The sample complexity of subspace learning with partial information, J. Mach. Learn. Res., 17 (2016), Paper No. 52, 21 pp., arXiv: 1402.4844.
[28] D. Gross, Recovering low-rank matrices from few coefficients in any basis, IEEE Transactions on Information Theory, 57 (2011), 1548-1566. doi: 10.1109/TIT.2011.2104999.
[29] N. Halko, P. G. Martinsson and J. A. Tropp, Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions, SIAM Review, 53 (2011), 217-288. doi: 10.1137/090771806.
[30] T. J. Hastie and R. J. Tibshirani, Generalized Additive Models, Chapman and Hall/CRC Monographs on Statistics and Applied Probability, Taylor and Francis, 1990, https://books.google.com/books?id=qa29r1Ze1coC.
[31] J. Haupt, R. M. Castro and R. Nowak, Distilled sensing: Adaptive sampling for sparse detection and estimation, IEEE Transactions on Information Theory, 57 (2011), 6222-6235. doi: 10.1109/TIT.2011.2162269.
[32] M. Hristache, A. Juditsky, J. Polzehl and V. Spokoiny, Structure adaptive approach for dimension reduction, The Annals of Statistics, 29 (2001), 1537-1566. doi: 10.1214/aos/1015345954.
[33] P. J. Huber, Projection pursuit, The Annals of Statistics, 13 (1985), 435-475. doi: 10.1214/aos/1176349519.
[34] A. B. Juditsky, O. V. Lepski and A. B. Tsybakov, Nonparametric estimation of composite functions, The Annals of Statistics, 37 (2009), 1360-1404. doi: 10.1214/08-AOS611.
[35] S. Keiper, Analysis of generalized ridge functions in high dimensions, in International Conference on Sampling Theory and Applications (SampTA), IEEE, 2015, 259-263. doi: 10.1109/SAMPTA.2015.7148892.
[36] M. Kolar and E. P. Xing, Consistent covariance selection from data with missing values, in Proceedings of the International Conference on Machine Learning (ICML-12), 2012, 551-558.
[37] A. Krishnamurthy, M. Azizyan and A. Singh, Subspace learning from extremely compressed measurements, arXiv preprint, arXiv: 1404.0751.
[38] R. Lam, O. Zahm, Y. Marzouk and K. Willcox, Multifidelity dimension reduction via active subspaces, arXiv preprint, arXiv: 1809.05567.
[39] K. C. Li, Sliced inverse regression for dimension reduction, Journal of the American Statistical Association, 86 (1991), 316-327. doi: 10.1080/01621459.1991.10475035.
[40] E. Liberty, F. Woolfe, P. G. Martinsson, V. Rokhlin and M. Tygert, Randomized algorithms for the low-rank approximation of matrices, Proceedings of the National Academy of Sciences, 104 (2007), 20167-20172. doi: 10.1073/pnas.0709640104.
[41] P. L. Loh and M. J. Wainwright, High-dimensional regression with noisy and missing data: Provable guarantees with non-convexity, Ann. Statist., 40 (2012), 1637-1664. doi: 10.1214/12-AOS1018.
[42] K. Lounici, High-dimensional covariance matrix estimation with missing observations, Bernoulli, 20 (2014), 1029-1058. doi: 10.3150/12-BEJ487.
[43] E. Novak and H. Woźniakowski, Tractability of Multivariate Problems: Standard Information for Functionals, vol. 12, European Mathematical Society, 2010. doi: 10.4171/084.
[44] F. W. J. Olver, NIST Handbook of Mathematical Functions, Cambridge University Press, 2010.
[45] A. Pinkus, Approximation theory of the MLP model in neural networks, Acta Numerica, 8 (1999), 143-195. doi: 10.1017/S0962492900002919.
[46] A. Pinkus, Ridge Functions, Cambridge Tracts in Mathematics, Cambridge University Press, 2015, https://books.google.com/books?id=dtMmCgAAQBAJ. doi: 10.1017/CBO9781316408124.
[47] F. Pourkamali-Anaraki, Estimation of the sample covariance matrix from compressive measurements, IET Signal Processing, 10 (2016), 1089, arXiv: 1512.08887. doi: 10.1049/iet-spr.2016.0169.
[48] C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning, The MIT Press, 2006.
[49] P. Ravikumar, M. J. Wainwright, G. Raskutti and B. Yu, High-dimensional covariance estimation by minimizing $l_1$-penalized log-determinant divergence, Electronic Journal of Statistics, 5 (2011), 935-980. doi: 10.1214/11-EJS631.
[50] B. Recht, A simpler approach to matrix completion, The Journal of Machine Learning Research, 12 (2011), 3413-3430.
[51] B. Recht, M. Fazel and P. A. Parrilo, Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization, SIAM Review, 52 (2010), 471-501. doi: 10.1137/070697835.
[52] A. M. Samarov, Exploring regression structure using nonparametric functional estimation, Journal of the American Statistical Association, 88 (1993), 836-847. doi: 10.1080/01621459.1993.10476348.
[53] T. Sarlos, Improved approximation algorithms for large matrices via random projections, in Annual IEEE Symposium on Foundations of Computer Science (FOCS'06), IEEE, 2006, 143-152. doi: 10.1109/FOCS.2006.37.
[54] C. J. Stone, Additive regression and other nonparametric models, The Annals of Statistics, 13 (1985), 689-705. doi: 10.1214/aos/1176349548.
[55] J. F. Traub and H. Wozniakowski, A General Theory of Optimal Algorithms, Academic Press, New York, 1980.
[56] R. Tripathy, I. Bilionis and M. Gonzalez, Gaussian processes with built-in dimensionality reduction: Applications to high-dimensional uncertainty propagation, Journal of Computational Physics, 321 (2016), 191-223. doi: 10.1016/j.jcp.2016.05.039.
[57] H. Tyagi and V. Cevher, Learning non-parametric basis independent models from point queries via low-rank methods, Applied and Computational Harmonic Analysis, 37 (2014), 389-412. doi: 10.1016/j.acha.2014.01.002.
[58] R. Vershynin, Introduction to the non-asymptotic analysis of random matrices, in Compressed Sensing, Cambridge Univ. Press, Cambridge, 2012, 210-268, arXiv: 1011.3027.
[59] F. Vivarelli and C. K. I. Williams, Discovering hidden features with Gaussian processes regression, in Advances in Neural Information Processing Systems, 613-619.
[60] P. Wedin, Perturbation bounds in connection with singular value decomposition, BIT Numerical Mathematics, 12 (1972), 99-111. doi: 10.1007/bf01932678.
[61] H. Wendland, Scattered Data Approximation, vol. 17, Cambridge University Press, Cambridge, 2005.
[62] Y. Xia, H. Tong, W. K. Li and L. X. Zhu, An adaptive estimation of dimension reduction space, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64 (2002), 363-410. doi: 10.1111/1467-9868.03411.
[63] X. Yin and B. Li, Sufficient dimension reduction based on an ensemble of minimum average variance estimators, The Annals of Statistics, 39 (2011), 3392-3416. doi: 10.1214/11-AOS950.
[64] O. Zahm, P. Constantine, C. Prieur and Y. Marzouk, Gradient-based dimension reduction of multivariate vector-valued functions, arXiv preprint, arXiv: 1801.07922.


Figure 1.  Visualization of the problem setup. The probability measure $ \mu $ is supported on the domain $ \mathbb{D}\subseteq \mathbb{R}^n $ of a smooth function $ f(\cdot) $. Here, $ N = 3 $ and $ X = \{x_i\}_{i = 1}^N $ are drawn independently from $ \mu $. For sufficiently small $ \epsilon $, we let $ \mathbb{B}_{x_i, \epsilon} $ denote the $ \epsilon $-neighborhood of each $ x_i $ and set $ \mathbb{B}_{X, \epsilon} = \cup_{i = 1}^N \mathbb{B}_{x_i, \epsilon} $. On $ \mathbb{B}_{X, \epsilon} $, $ \mu $ induces the conditional measure $ \mu_{X, \epsilon} $, from which $ N_{X, \epsilon} $ points are independently drawn and collected in $ Y_{X, \epsilon} = \{y_{ij}\}_{i, j} $. Here, $ x_1 $ has $ N_{x_1, \epsilon} = 2 $ neighbors in $ Y_{X, \epsilon} $ and we set $ Y_{x_1, \epsilon} = \{y_{1, j} \}_{j = 1}^{N_{x_1, \epsilon}} $. Similarly, $ Y_{x_2, \epsilon} $ and $ Y_{x_3, \epsilon} $ are formed. Note that $ Y_{X, \epsilon} = \cup_{i = 1}^N Y_{x_i, \epsilon} $. Our objective is to estimate the second-moment matrix of $ f(\cdot) $ (with respect to the probability measure $ \mu $) given $ \{x_i, f(x_i)\} $ and $ \{y_{ij}, f(y_{ij})\} $
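As a concrete reading of the sampling setup in Figure 1, the following sketch draws $ N $ centers from $ \mu $ and a fixed number of neighbors from each $ \epsilon $-ball. The uniform measure on the unit cube, the equal neighbor count per center, and the uniform-in-ball sampling of the conditional measure $ \mu_{X, \epsilon} $ are illustrative assumptions, not the paper's exact construction.

```python
# Minimal sketch of the Figure 1 sampling setup under assumed choices:
# mu = uniform measure on the unit cube, and mu_{X,eps} sampled by drawing
# points uniformly from each epsilon-ball around a center.
import numpy as np

def draw_design(n, N, N_min, eps, seed=None):
    """Draw N centers x_i ~ mu and N_min neighbors y_ij per center from mu_{X,eps}."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(N, n))              # centers x_1, ..., x_N
    Y = []
    for x in X:
        # uniform sampling inside the epsilon-ball: random direction, scaled radius
        dirs = rng.standard_normal((N_min, n))
        dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
        radii = eps * rng.uniform(0.0, 1.0, size=(N_min, 1)) ** (1.0 / n)
        Y.append(x + radii * dirs)                       # neighbors y_ij in B_{x_i,eps}
    return X, np.stack(Y)                                # Y has shape (N, N_min, n)

X, Y = draw_design(n=3, N=3, N_min=2, eps=1e-4, seed=0)
print(X.shape, Y.shape)   # (3, 3) and (3, 2, 3): N = 3 centers with 2 neighbors each
```

The centers and neighbors, together with the evaluations $ \{x_i, f(x_i)\} $ and $ \{y_{ij}, f(y_{ij})\} $, are exactly the inputs the estimator of $ \Sigma_{\mu} $ consumes.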
Figure 2.  A simulation study with a 500-dimensional ($ n = 500 $) quadratic function, showing the relative error of the approximation (7) as a function of the total number of samples $ N_{X, \epsilon} $. All simulations use $ \epsilon = 10^{-4} $. Each subfigure fixes a different value of the minimum number $ N_{X, \text{min}, \epsilon} $ of points in the $ \epsilon $-neighborhood of each center point in $ X $ and varies the number $ N $ of centers; each of the five groups of blue dots in a subfigure corresponds to a different value of $ N $, with ten replications per group. The slope of each line is $ -1/2 $, which verifies the expected convergence rate from (6)