# American Institute of Mathematical Sciences

doi: 10.3934/ipi.2021039
Online First

Online First articles are published articles within a journal that have not yet been assigned to a formal issue. This means they do not yet have a volume number, issue number, or page numbers assigned to them, however, they can still be found and cited using their DOI (Digital Object Identifier). Online First publication benefits the research community by making new scientific discoveries known as quickly as possible.

Readers can access Online First articles via the “Online First” tab for the selected journal.

## Overcomplete representation in a hierarchical Bayesian framework

 1 University of Bologna, Department of Mathematics, Piazza di Porta San Donato 5, Bologna, Italy 2 Case Western Reserve University, Department of Mathematics, Applied Mathematics and Statistics, 10900 Euclid Avenue, Cleveland, OH 4410, USA

*Corresponding author: Monica Pragliola

Received  September 2020 Revised  April 2021 Early access May 2021

Fund Project: MP acknowledges the National Group for Scientific Computation (GNCS-INDAM). The work of DC was partly supported by NSF grants DMS-1522334 and DMS-1951446, and the work of ES was partly supported by NSF grant DMS-1714617

A common task in inverse problems and imaging is finding a solution that is sparse, in the sense that most of its components vanish. In the framework of compressed sensing, general results guaranteeing exact recovery have been proven. In practice, sparse solutions are often computed combining $\ell_1$-penalized least squares optimization with an appropriate numerical scheme to accomplish the task. A computationally efficient alternative for finding sparse solutions to linear inverse problems is provided by Bayesian hierarchical models, in which the sparsity is encoded by defining a conditionally Gaussian prior model with the prior parameter obeying a generalized gamma distribution. An iterative alternating sequential (IAS) algorithm has been demonstrated to lead to a computationally efficient scheme, and combined with Krylov subspace iterations with an early termination condition, the approach is particularly well suited for large scale problems. Here the Bayesian approach to sparsity is extended to problems whose solution allows a sparse coding in an overcomplete system such as composite frames. It is shown that among the multiple possible representations of the unknown, the IAS algorithm, and in particular, a hybrid version of it, is effectively identifying the most sparse solution. Computed examples show that the method is particularly well suited not only for traditional imaging applications but also for dictionary learning problems in the framework of machine learning.

Citation: Monica Pragliola, Daniela Calvetti, Erkki Somersalo. Overcomplete representation in a hierarchical Bayesian framework. Inverse Problems & Imaging, doi: 10.3934/ipi.2021039
##### References:
 [1] J. M. Bardsley, D. Calvetti and E. Somersalo, Hierarchical regularization for edge-preserving reconstruction of PET images, Inverse Problems, 26 (2010), 035010. doi: 10.1088/0266-5611/26/3/035010.  Google Scholar [2] S. Boyd, N. Parikh, E. Chu, B. Peleato and J. Eckstein, Distributed optimization and statistical learning via the alternating direction method of multipliers, Foundations and Trends in Machine Learning, 3 (2011), 1-122.  doi: 10.1561/2200000016.  Google Scholar [3] A. M. Bruckstein, D. L. Donoho and M. Elad, From sparse solutions of systems of equations to sparse modeling of signals and images, SIAM Review, 51 (2009), 34-81.  doi: 10.1137/060657704.  Google Scholar [4] D. Calvetti, H. Hakula, S. Pursiainen and E. Somersalo, Conditionally Gaussian hypermodels for cerebral source localization, SIAM Journal on Imaging Sciences, 2 (2009), 879-909.  doi: 10.1137/080723995.  Google Scholar [5] D. Calvetti, F. Pitolli, J. Prezioso, E. Somersalo and B. Vantaggi, Priorconditioned CGLS-based quasi-MAP estimate, statistical stopping rule, and ranking of priors, SIAM Journal of Scientific Computing, 39 (2017), S477–S500. doi: 10.1137/16M108272X.  Google Scholar [6] D. Calvetti, A. Pascarella, F. Pitolli, E. Somersalo and B. Vantaggi, Brain activity mapping from MEG data via a hierarchical Bayesian algorithm with automatic depth weighting, Brain Topography, 32 (2019), 363-393.  doi: 10.1007/s10548-018-0670-7.  Google Scholar [7] D. Calvetti, F. Pitolli, E. Somersalo and B. Vantaggi, Bayes meets Krylov: Statistically inspired preconditioners for CGLS, SIAM Review, 60 (2018), 429-461.  doi: 10.1137/15M1055061.  Google Scholar [8] D. Calvetti, M. Pragliola and E. Somersalo, Sparsity promoting hybrid solvers for hierarchical Bayesian inverse problems, SIAM Journal on Scientific Computing 42 (2020), A3761–A3784. doi: 10.1137/20M1326246.  Google Scholar [9] D. Calvetti, M. Pragliola, E. Somersalo and A. Strang, Sparse reconstructions from few noisy data: Analysis of hierarchical Bayesian models with generalized gamma hyperpriors, Inverse Problems, 36 (2020), 025010. doi: 10.1088/1361-6420/ab4d92.  Google Scholar [10] D. Calvetti, E. Somersalo and A. Strang, Hierachical Bayesian models and sparsity: $\ell_2$-magic, Inverse Problems, 35 (2019), 035003. doi: 10.1088/1361-6420/aaf5ab.  Google Scholar [11] E. J. Candes, J. Romberg and T. Tao, Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information, IEEE Transactions on Information Theory, 52 (2006), 489-509.  doi: 10.1109/TIT.2005.862083.  Google Scholar [12] E. J. Candes, Y. C. Eldar, D. Needell and P. Randall, Compressed sensing with coherent and redundant dictionaries, Applied and Computational Harmonic Analysis, 31 (2011), 59-73.  doi: 10.1016/j.acha.2010.10.002.  Google Scholar [13] A. Chambolle, M. Holler and T. Pock, A convex variational model for learning convolutional image atoms from incomplete data, Journal of Mathematical Imaging and Vision, 62 (2020), 417-444.  doi: 10.1007/s10851-019-00919-7.  Google Scholar [14] G. Chen and D. Needell, Compressed sensing and dictionary learning, Finite Frame Theory, Proceedings of Symposia in Applied Mathematics, Vol. 73, Providence, RI, (2016), 201–241.  Google Scholar [15] S. S. Chen, D. L. Donoho and and M. A. Saunders, Atomic decomposition by basis pursuit, SIAM Journal on Scientific Computing, 20 (1998), 33-61.  doi: 10.1137/S1064827596304010.  Google Scholar [16] D. Ma, V. Gulani, N. Seiberlich, K. Liu, J. L. Sunshine, J. L. Duerk and M. A. Griswold, Magnetic resonance fingerprinting, Nature, 495 (2013), 187-192.  doi: 10.1038/nature11971.  Google Scholar [17] S. G. Mallat and Z. Zhang, Matching pursuits with time-frequency dictionaries, IEEE Transactions on Signal Processing, 41 (1993), 3397-3415.   Google Scholar [18] R. Rubinstein, A. M. Bruckstein and M. Elad, Dictionaries for sparse representation modeling, Proceedings of the IEEE, 98 (2010), 1045-1057.  doi: 10.1109/JPROC.2010.2040551.  Google Scholar [19] J. Starck, J. Fadili and F. J. Murtagh, The undecimated wavelet decomposition and its reconstruction, IEEE Transactions on Image Processing, 16 (2007), 297-309.  doi: 10.1109/TIP.2006.887733.  Google Scholar [20] J. L. Starck, M. Elad and D. Donoho, Redundant multiscale transforms and their application for morphological component separation, Advances in Imaging and Electron Physics, 132 (2004), 287-348.  doi: 10.1016/S1076-5670(04)32006-9.  Google Scholar [21] A. F. Vidal, V. De Bortoli, M. Pereyra and A. Durmus, Maximum likelihood estimation of regularization parameters in high-dimensional inverse problems: An empirical Bayesian approach Part Ⅰ: Methodology and experiments, SIAM Journal on Imaging Sciences, 13 (2020), 1945-1989.  doi: 10.1137/20M1339829.  Google Scholar

show all references

##### References:
 [1] J. M. Bardsley, D. Calvetti and E. Somersalo, Hierarchical regularization for edge-preserving reconstruction of PET images, Inverse Problems, 26 (2010), 035010. doi: 10.1088/0266-5611/26/3/035010.  Google Scholar [2] S. Boyd, N. Parikh, E. Chu, B. Peleato and J. Eckstein, Distributed optimization and statistical learning via the alternating direction method of multipliers, Foundations and Trends in Machine Learning, 3 (2011), 1-122.  doi: 10.1561/2200000016.  Google Scholar [3] A. M. Bruckstein, D. L. Donoho and M. Elad, From sparse solutions of systems of equations to sparse modeling of signals and images, SIAM Review, 51 (2009), 34-81.  doi: 10.1137/060657704.  Google Scholar [4] D. Calvetti, H. Hakula, S. Pursiainen and E. Somersalo, Conditionally Gaussian hypermodels for cerebral source localization, SIAM Journal on Imaging Sciences, 2 (2009), 879-909.  doi: 10.1137/080723995.  Google Scholar [5] D. Calvetti, F. Pitolli, J. Prezioso, E. Somersalo and B. Vantaggi, Priorconditioned CGLS-based quasi-MAP estimate, statistical stopping rule, and ranking of priors, SIAM Journal of Scientific Computing, 39 (2017), S477–S500. doi: 10.1137/16M108272X.  Google Scholar [6] D. Calvetti, A. Pascarella, F. Pitolli, E. Somersalo and B. Vantaggi, Brain activity mapping from MEG data via a hierarchical Bayesian algorithm with automatic depth weighting, Brain Topography, 32 (2019), 363-393.  doi: 10.1007/s10548-018-0670-7.  Google Scholar [7] D. Calvetti, F. Pitolli, E. Somersalo and B. Vantaggi, Bayes meets Krylov: Statistically inspired preconditioners for CGLS, SIAM Review, 60 (2018), 429-461.  doi: 10.1137/15M1055061.  Google Scholar [8] D. Calvetti, M. Pragliola and E. Somersalo, Sparsity promoting hybrid solvers for hierarchical Bayesian inverse problems, SIAM Journal on Scientific Computing 42 (2020), A3761–A3784. doi: 10.1137/20M1326246.  Google Scholar [9] D. Calvetti, M. Pragliola, E. Somersalo and A. Strang, Sparse reconstructions from few noisy data: Analysis of hierarchical Bayesian models with generalized gamma hyperpriors, Inverse Problems, 36 (2020), 025010. doi: 10.1088/1361-6420/ab4d92.  Google Scholar [10] D. Calvetti, E. Somersalo and A. Strang, Hierachical Bayesian models and sparsity: $\ell_2$-magic, Inverse Problems, 35 (2019), 035003. doi: 10.1088/1361-6420/aaf5ab.  Google Scholar [11] E. J. Candes, J. Romberg and T. Tao, Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information, IEEE Transactions on Information Theory, 52 (2006), 489-509.  doi: 10.1109/TIT.2005.862083.  Google Scholar [12] E. J. Candes, Y. C. Eldar, D. Needell and P. Randall, Compressed sensing with coherent and redundant dictionaries, Applied and Computational Harmonic Analysis, 31 (2011), 59-73.  doi: 10.1016/j.acha.2010.10.002.  Google Scholar [13] A. Chambolle, M. Holler and T. Pock, A convex variational model for learning convolutional image atoms from incomplete data, Journal of Mathematical Imaging and Vision, 62 (2020), 417-444.  doi: 10.1007/s10851-019-00919-7.  Google Scholar [14] G. Chen and D. Needell, Compressed sensing and dictionary learning, Finite Frame Theory, Proceedings of Symposia in Applied Mathematics, Vol. 73, Providence, RI, (2016), 201–241.  Google Scholar [15] S. S. Chen, D. L. Donoho and and M. A. Saunders, Atomic decomposition by basis pursuit, SIAM Journal on Scientific Computing, 20 (1998), 33-61.  doi: 10.1137/S1064827596304010.  Google Scholar [16] D. Ma, V. Gulani, N. Seiberlich, K. Liu, J. L. Sunshine, J. L. Duerk and M. A. Griswold, Magnetic resonance fingerprinting, Nature, 495 (2013), 187-192.  doi: 10.1038/nature11971.  Google Scholar [17] S. G. Mallat and Z. Zhang, Matching pursuits with time-frequency dictionaries, IEEE Transactions on Signal Processing, 41 (1993), 3397-3415.   Google Scholar [18] R. Rubinstein, A. M. Bruckstein and M. Elad, Dictionaries for sparse representation modeling, Proceedings of the IEEE, 98 (2010), 1045-1057.  doi: 10.1109/JPROC.2010.2040551.  Google Scholar [19] J. Starck, J. Fadili and F. J. Murtagh, The undecimated wavelet decomposition and its reconstruction, IEEE Transactions on Image Processing, 16 (2007), 297-309.  doi: 10.1109/TIP.2006.887733.  Google Scholar [20] J. L. Starck, M. Elad and D. Donoho, Redundant multiscale transforms and their application for morphological component separation, Advances in Imaging and Electron Physics, 132 (2004), 287-348.  doi: 10.1016/S1076-5670(04)32006-9.  Google Scholar [21] A. F. Vidal, V. De Bortoli, M. Pereyra and A. Durmus, Maximum likelihood estimation of regularization parameters in high-dimensional inverse problems: An empirical Bayesian approach Part Ⅰ: Methodology and experiments, SIAM Journal on Imaging Sciences, 13 (2020), 1945-1989.  doi: 10.1137/20M1339829.  Google Scholar
The generative model (left) and the blurred and noisy data vector $b \in {\mathbb R}^{46}$ (right)
Reconstruction of the signal $x$ (left) and the count of CGLS steps per outer iteration of the global hybrid IAS (right)
Vector $\alpha_i$ (left panels), corresponding scaled variances (middle panels) and contribution of the signal ${\mathsf W}_i \alpha_i$ (right panels) for $i = 1$, i.e. representation in terms of increments, (top row) and $i = 2$, i.e. representation in terms of cosine transform, (bottom row)
Original image (left), observed data (middle) and reconstructed image (right)
Vector $\alpha_i$ (left panels), base-10 logarithmic plot of the corresponding scaled variances (middle panels) and vector ${\mathsf W}_i\alpha_i$ contributing to the final restoration (right panels) for $i = 1$, i.e. vertical increments representation, (top row), $i = 2$, i.e. horizontal increments representation, (bottom row)
Top row: original $512\times512$ test image (left), observed data (middle) and denoised image (right). Bottom row: respective close-up(s)
Vector $\alpha_i$ (left panels), base-10 logarithmic plot of the corresponding scaled variances (middle panels), and vector $\mathsf{W}_i\alpha_i$ (right panels) for $i = 1$, i.e. vertical increments representation, (top row) and $i = 2$, horizontal increments representation, (bottom row)
Value distributions of the coefficients corresponding to vertical (left) and horizontal (right) increments in logarithmic scale. The blue distributions correspond to the original image which does not allow sparse representation in the basis, which is reflected in the high percentage of non-vanishing coefficients, while the red distribution is the sparse denoised reconstruction, in which the percentage of coefficients above the negligible threshold value is reduced by an order of magnitude
Original image (left), observed data (middle) and reconstructed image (right)
Representation vector $\alpha_i$ (left panels), base-10 logarithmic plot of the corresponding scaled variances (middle panels) and vector ${\mathsf W}_i\alpha_i$ contributing to the final restoration (right panels) for $i = 1$ (first row), $i = 2$ (second row), $i = 3$ (third row) and $i = 4$ (fourth row)
A schematic representation of the dictionary learning example. The digit on the right is the non-annotated image $b$, which is approximated in terms of the annotated atoms $w_i$ on the right. The coefficients $\alpha_i$ can then be used to identify the digit
Dictionary learning results. The first row shows the test images of the digits to be classified by the dictionary learning algorithm (vector $b$), the true annotation indicated in the figure, the second row the vectors $\theta$ after the IAS iteration with the first hyperprior, and the third row after the iteration with the second hyperprior. The fourth row represents the synthesis ${\mathsf W}\alpha$ approximating the original digit, and finally, the fifth row gives the histogram of the annotations of the atoms corresponding to coefficients above a threshold $\tau = 0.01$. The annotation is done by majority vote, choosing the largest of the bins. In this example, the standard deviation of the noise representing the mismatch was $\sigma = 0.01$
The rows are as in Figure 12. In this example, the standard deviation of the noise representing the mismatch was $\sigma = 0.05$. Observe that the approximation becomes sparser
The rows are as in Figure 12. In this example, the standard deviation of the noise representing the mismatch was $\sigma = 0.1$. The increased sparsity here is traded with an increased number of misclassifications, such as in the first and the third columns
 [1] Abdeslem Hafid Bentbib, Smahane El-Halouy, El Mostafa Sadek. Extended Krylov subspace methods for solving Sylvester and Stein tensor equations. Discrete & Continuous Dynamical Systems - S, 2021  doi: 10.3934/dcdss.2021026 [2] Qiao Liang, Qiang Ye. Deflation by restriction for the inverse-free preconditioned Krylov subspace method. Numerical Algebra, Control & Optimization, 2016, 6 (1) : 55-71. doi: 10.3934/naco.2016.6.55 [3] Radu Balan, Peter G. Casazza, Christopher Heil and Zeph Landau. Density, overcompleteness, and localization of frames. Electronic Research Announcements, 2006, 12: 71-86. [4] Manuel V. C. Vieira. Derivatives of eigenvalues and Jordan frames. Numerical Algebra, Control & Optimization, 2016, 6 (2) : 115-126. doi: 10.3934/naco.2016003 [5] Mikko Orispää, Markku Lehtinen. Fortran linear inverse problem solver. Inverse Problems & Imaging, 2010, 4 (3) : 485-503. doi: 10.3934/ipi.2010.4.485 [6] Antonio Cossidente, Sascha Kurz, Giuseppe Marino, Francesco Pavese. Combining subspace codes. Advances in Mathematics of Communications, 2021  doi: 10.3934/amc.2021007 [7] Michael Kiermaier, Reinhard Laue. Derived and residual subspace designs. Advances in Mathematics of Communications, 2015, 9 (1) : 105-115. doi: 10.3934/amc.2015.9.105 [8] Hanbing Liu, Yongdong Huang, Chongjun Li. Weaving K-fusion frames in hilbert spaces. Mathematical Foundations of Computing, 2020, 3 (2) : 101-116. doi: 10.3934/mfc.2020008 [9] Mike Crampin, Tom Mestdag. Reduction of invariant constrained systems using anholonomic frames. Journal of Geometric Mechanics, 2011, 3 (1) : 23-40. doi: 10.3934/jgm.2011.3.23 [10] Amir Averbuch, Pekka Neittaanmäki, Valery Zheludev. Periodic spline-based frames for image restoration. Inverse Problems & Imaging, 2015, 9 (3) : 661-707. doi: 10.3934/ipi.2015.9.661 [11] Evelyn Herberg, Michael Hinze, Henrik Schumacher. Maximal discrete sparsity in parabolic optimal control with measures. Mathematical Control & Related Fields, 2020, 10 (4) : 735-759. doi: 10.3934/mcrf.2020018 [12] Bruno Sixou, Valentina Davidoiu, Max Langer, Francoise Peyrin. Absorption and phase retrieval with Tikhonov and joint sparsity regularizations. Inverse Problems & Imaging, 2013, 7 (1) : 267-282. doi: 10.3934/ipi.2013.7.267 [13] Liyan Ma, Lionel Moisan, Jian Yu, Tieyong Zeng. A stable method solving the total variation dictionary model with $L^\infty$ constraints. Inverse Problems & Imaging, 2014, 8 (2) : 507-535. doi: 10.3934/ipi.2014.8.507 [14] Yangyang Xu, Wotao Yin. A fast patch-dictionary method for whole image recovery. Inverse Problems & Imaging, 2016, 10 (2) : 563-583. doi: 10.3934/ipi.2016012 [15] Kanghui Guo, Demetrio Labate, Wang-Q Lim, Guido Weiss and Edward Wilson. Wavelets with composite dilations. Electronic Research Announcements, 2004, 10: 78-87. [16] Timo Berthold, Ambros M. Gleixner, Stefan Heinz, Stefan Vigerske. Analyzing the computational impact of MIQCP solver components. Numerical Algebra, Control & Optimization, 2012, 2 (4) : 739-748. doi: 10.3934/naco.2012.2.739 [17] Heide Gluesing-Luerssen, Carolyn Troha. Construction of subspace codes through linkage. Advances in Mathematics of Communications, 2016, 10 (3) : 525-540. doi: 10.3934/amc.2016023 [18] Ernst M. Gabidulin, Pierre Loidreau. Properties of subspace subcodes of Gabidulin codes. Advances in Mathematics of Communications, 2008, 2 (2) : 147-157. doi: 10.3934/amc.2008.2.147 [19] Ruijun Zhao, Yong-Tao Zhang, Shanqin Chen. Krylov implicit integration factor WENO method for SIR model with directed diffusion. Discrete & Continuous Dynamical Systems - B, 2019, 24 (9) : 4983-5001. doi: 10.3934/dcdsb.2019041 [20] Sylvain Sorin, Cheng Wan. Finite composite games: Equilibria and dynamics. Journal of Dynamics & Games, 2016, 3 (1) : 101-120. doi: 10.3934/jdg.2016005

2020 Impact Factor: 1.639

## Tools

Article outline

Figures and Tables