Article Contents
Article Contents

# Overcomplete representation in a hierarchical Bayesian framework

• *Corresponding author: Monica Pragliola

MP acknowledges the National Group for Scientific Computation (GNCS-INDAM). The work of DC was partly supported by NSF grants DMS-1522334 and DMS-1951446, and the work of ES was partly supported by NSF grant DMS-1714617

• A common task in inverse problems and imaging is finding a solution that is sparse, in the sense that most of its components vanish. In the framework of compressed sensing, general results guaranteeing exact recovery have been proven. In practice, sparse solutions are often computed combining $\ell_1$-penalized least squares optimization with an appropriate numerical scheme to accomplish the task. A computationally efficient alternative for finding sparse solutions to linear inverse problems is provided by Bayesian hierarchical models, in which the sparsity is encoded by defining a conditionally Gaussian prior model with the prior parameter obeying a generalized gamma distribution. An iterative alternating sequential (IAS) algorithm has been demonstrated to lead to a computationally efficient scheme, and combined with Krylov subspace iterations with an early termination condition, the approach is particularly well suited for large scale problems. Here the Bayesian approach to sparsity is extended to problems whose solution allows a sparse coding in an overcomplete system such as composite frames. It is shown that among the multiple possible representations of the unknown, the IAS algorithm, and in particular, a hybrid version of it, is effectively identifying the most sparse solution. Computed examples show that the method is particularly well suited not only for traditional imaging applications but also for dictionary learning problems in the framework of machine learning.

Mathematics Subject Classification: Primary: 62F15, 94A08.

 Citation:

• Figure 1.  The generative model (left) and the blurred and noisy data vector $b \in {\mathbb R}^{46}$ (right)

Figure 2.  Reconstruction of the signal $x$ (left) and the count of CGLS steps per outer iteration of the global hybrid IAS (right)

Figure 3.  Vector $\alpha_i$ (left panels), corresponding scaled variances (middle panels) and contribution of the signal ${\mathsf W}_i \alpha_i$ (right panels) for $i = 1$, i.e. representation in terms of increments, (top row) and $i = 2$, i.e. representation in terms of cosine transform, (bottom row)

Figure 4.  Original image (left), observed data (middle) and reconstructed image (right)

Figure 5.  Vector $\alpha_i$ (left panels), base-10 logarithmic plot of the corresponding scaled variances (middle panels) and vector ${\mathsf W}_i\alpha_i$ contributing to the final restoration (right panels) for $i = 1$, i.e. vertical increments representation, (top row), $i = 2$, i.e. horizontal increments representation, (bottom row)

Figure 6.  Top row: original $512\times512$ test image (left), observed data (middle) and denoised image (right). Bottom row: respective close-up(s)

Figure 7.  Vector $\alpha_i$ (left panels), base-10 logarithmic plot of the corresponding scaled variances (middle panels), and vector $\mathsf{W}_i\alpha_i$ (right panels) for $i = 1$, i.e. vertical increments representation, (top row) and $i = 2$, horizontal increments representation, (bottom row)

Figure 8.  Value distributions of the coefficients corresponding to vertical (left) and horizontal (right) increments in logarithmic scale. The blue distributions correspond to the original image which does not allow sparse representation in the basis, which is reflected in the high percentage of non-vanishing coefficients, while the red distribution is the sparse denoised reconstruction, in which the percentage of coefficients above the negligible threshold value is reduced by an order of magnitude

Figure 9.  Original image (left), observed data (middle) and reconstructed image (right)

Figure 10.  Representation vector $\alpha_i$ (left panels), base-10 logarithmic plot of the corresponding scaled variances (middle panels) and vector ${\mathsf W}_i\alpha_i$ contributing to the final restoration (right panels) for $i = 1$ (first row), $i = 2$ (second row), $i = 3$ (third row) and $i = 4$ (fourth row)

Figure 11.  A schematic representation of the dictionary learning example. The digit on the right is the non-annotated image $b$, which is approximated in terms of the annotated atoms $w_i$ on the right. The coefficients $\alpha_i$ can then be used to identify the digit

Figure 12.  Dictionary learning results. The first row shows the test images of the digits to be classified by the dictionary learning algorithm (vector $b$), the true annotation indicated in the figure, the second row the vectors $\theta$ after the IAS iteration with the first hyperprior, and the third row after the iteration with the second hyperprior. The fourth row represents the synthesis ${\mathsf W}\alpha$ approximating the original digit, and finally, the fifth row gives the histogram of the annotations of the atoms corresponding to coefficients above a threshold $\tau = 0.01$. The annotation is done by majority vote, choosing the largest of the bins. In this example, the standard deviation of the noise representing the mismatch was $\sigma = 0.01$

Figure 13.  The rows are as in Figure 12. In this example, the standard deviation of the noise representing the mismatch was $\sigma = 0.05$. Observe that the approximation becomes sparser

Figure 14.  The rows are as in Figure 12. In this example, the standard deviation of the noise representing the mismatch was $\sigma = 0.1$. The increased sparsity here is traded with an increased number of misclassifications, such as in the first and the third columns

•  [1] J. M. Bardsley, D. Calvetti and E. Somersalo, Hierarchical regularization for edge-preserving reconstruction of PET images, Inverse Problems, 26 (2010), 035010. doi: 10.1088/0266-5611/26/3/035010. [2] S. Boyd, N. Parikh, E. Chu, B. Peleato and J. Eckstein, Distributed optimization and statistical learning via the alternating direction method of multipliers, Foundations and Trends in Machine Learning, 3 (2011), 1-122.  doi: 10.1561/2200000016. [3] A. M. Bruckstein, D. L. Donoho and M. Elad, From sparse solutions of systems of equations to sparse modeling of signals and images, SIAM Review, 51 (2009), 34-81.  doi: 10.1137/060657704. [4] D. Calvetti, H. Hakula, S. Pursiainen and E. Somersalo, Conditionally Gaussian hypermodels for cerebral source localization, SIAM Journal on Imaging Sciences, 2 (2009), 879-909.  doi: 10.1137/080723995. [5] D. Calvetti, F. Pitolli, J. Prezioso, E. Somersalo and B. Vantaggi, Priorconditioned CGLS-based quasi-MAP estimate, statistical stopping rule, and ranking of priors, SIAM Journal of Scientific Computing, 39 (2017), S477–S500. doi: 10.1137/16M108272X. [6] D. Calvetti, A. Pascarella, F. Pitolli, E. Somersalo and B. Vantaggi, Brain activity mapping from MEG data via a hierarchical Bayesian algorithm with automatic depth weighting, Brain Topography, 32 (2019), 363-393.  doi: 10.1007/s10548-018-0670-7. [7] D. Calvetti, F. Pitolli, E. Somersalo and B. Vantaggi, Bayes meets Krylov: Statistically inspired preconditioners for CGLS, SIAM Review, 60 (2018), 429-461.  doi: 10.1137/15M1055061. [8] D. Calvetti, M. Pragliola and E. Somersalo, Sparsity promoting hybrid solvers for hierarchical Bayesian inverse problems, SIAM Journal on Scientific Computing 42 (2020), A3761–A3784. doi: 10.1137/20M1326246. [9] D. Calvetti, M. Pragliola, E. Somersalo and A. Strang, Sparse reconstructions from few noisy data: Analysis of hierarchical Bayesian models with generalized gamma hyperpriors, Inverse Problems, 36 (2020), 025010. doi: 10.1088/1361-6420/ab4d92. [10] D. Calvetti, E. Somersalo and A. Strang, Hierachical Bayesian models and sparsity: $\ell_2$-magic, Inverse Problems, 35 (2019), 035003. doi: 10.1088/1361-6420/aaf5ab. [11] E. J. Candes, J. Romberg and T. Tao, Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information, IEEE Transactions on Information Theory, 52 (2006), 489-509.  doi: 10.1109/TIT.2005.862083. [12] E. J. Candes, Y. C. Eldar, D. Needell and P. Randall, Compressed sensing with coherent and redundant dictionaries, Applied and Computational Harmonic Analysis, 31 (2011), 59-73.  doi: 10.1016/j.acha.2010.10.002. [13] A. Chambolle, M. Holler and T. Pock, A convex variational model for learning convolutional image atoms from incomplete data, Journal of Mathematical Imaging and Vision, 62 (2020), 417-444.  doi: 10.1007/s10851-019-00919-7. [14] G. Chen and D. Needell, Compressed sensing and dictionary learning, Finite Frame Theory, Proceedings of Symposia in Applied Mathematics, Vol. 73, Providence, RI, (2016), 201–241. [15] S. S. Chen, D. L. Donoho and and M. A. Saunders, Atomic decomposition by basis pursuit, SIAM Journal on Scientific Computing, 20 (1998), 33-61.  doi: 10.1137/S1064827596304010. [16] D. Ma, V. Gulani, N. Seiberlich, K. Liu, J. L. Sunshine, J. L. Duerk and M. A. Griswold, Magnetic resonance fingerprinting, Nature, 495 (2013), 187-192.  doi: 10.1038/nature11971. [17] S. G. Mallat and Z. Zhang, Matching pursuits with time-frequency dictionaries, IEEE Transactions on Signal Processing, 41 (1993), 3397-3415. [18] R. Rubinstein, A. M. Bruckstein and M. Elad, Dictionaries for sparse representation modeling, Proceedings of the IEEE, 98 (2010), 1045-1057.  doi: 10.1109/JPROC.2010.2040551. [19] J. Starck, J. Fadili and F. J. Murtagh, The undecimated wavelet decomposition and its reconstruction, IEEE Transactions on Image Processing, 16 (2007), 297-309.  doi: 10.1109/TIP.2006.887733. [20] J. L. Starck, M. Elad and D. Donoho, Redundant multiscale transforms and their application for morphological component separation, Advances in Imaging and Electron Physics, 132 (2004), 287-348.  doi: 10.1016/S1076-5670(04)32006-9. [21] A. F. Vidal, V. De Bortoli, M. Pereyra and A. Durmus, Maximum likelihood estimation of regularization parameters in high-dimensional inverse problems: An empirical Bayesian approach Part Ⅰ: Methodology and experiments, SIAM Journal on Imaging Sciences, 13 (2020), 1945-1989.  doi: 10.1137/20M1339829.

Figures(14)