Advanced Search
Article Contents
Article Contents

Optimal estimation of $\ell_1$-regularization prior from a regularized empirical Bayesian risk standpoint

Abstract Related Papers Cited by
  • We address the problem of prior matrix estimation for the solution of $\ell_1$-regularized ill-posed inverse problems. From a Bayesian viewpoint, we show that such a matrix can be regarded as an influence matrix in a multivariate $\ell_1$-Laplace density function. Assuming a training set is given, the prior matrix design problem is cast as a maximum likelihood term with an additional sparsity-inducing term. This formulation results in an unconstrained yet nonconvex optimization problem. Memory requirements as well as computation of the nonlinear, nonsmooth sub-gradient equations are prohibitive for large-scale problems. Thus, we introduce an iterative algorithm to design efficient priors for such large problems. We further demonstrate that the solutions of ill-posed inverse problems by incorporation of $\ell_1$-regularization using the learned prior matrix perform generally better than commonly used regularization techniques where the prior matrix is chosen a-priori.
    Mathematics Subject Classification: Primary: 49N45; Secondary: 62F15.


    \begin{equation} \\ \end{equation}
  • [1]

    H. Akaike, A new look at the statistical model identification, IEEE Trans. Auto. Cont., 19 (1974), 716-723.doi: 10.1109/TAC.1974.1100705.


    A. Aravkin, An $l_1$-Laplace robust kalman smoother, IEEE Trans. Auto. Cont., 56 (2011), 2898-2911.doi: 10.1109/TAC.2011.2141430.


    U. Ascher, E. Haber and H. Huang, On effective methods for implicit piecewise smooth surface recovery, SIAM J. Scient. Comput., 28 (2006), 339-358.doi: 10.1137/040617261.


    U. Ascher, K. van den Doel, H. Huang and B. Svaiter, Gradient descent and fast artificial time integration, M2AN, Math. Model. Numer. Anal., 43 (2009), 689-708.


    J. A. Bilmes, "Factored Sparse Inverse Covariance Matrices," Proc. IEEE Intl. Conf. on Acous., Spee., and Sig. Proc., 2000.


    A. M. Bruckstein, D. L. Donoho and M. Elad, From sparse solutions of systems of equations to sparse modeling of signals and images, SIAM Review, 51 (2009), 34-81.doi: 10.1137/060657704.


    E. J. Candès and D. L. Donoho, "Curvelets: A Surprisingly Effective Nonadaptive Representation for Objects with Edges," 1999.


    E. J. Candès and M. B. Wakin, An introduction to compressive sampling, IEEE In Sig. Proc. Magaz., 25 (2008), 21-30.doi: 10.1109/MSP.2007.914731.


    T. Chan and J. Shen, "Image Processing and Analysis: Variational, PDE, Wavelet, and Stochastic Methods," SIAM, 2005.


    X. Chen and W. Zhou., Smoothing nonlinear conjugate gradient method for image restoration using nonsmooth nonconvex minimization, SIAM J. Imaging Sci., 3 (2010), 765-790.doi: 10.1137/080740167.


    M. Cheney, D.Isaacson and J. C. Newell, Electrical impedance tomography, SIAM Review, 41 (1999), 85-101.doi: 10.1137/S0036144598333613.


    J. Dahl, L. Vandenberghe and V. Roychowdhury, Covariance selection for non-chordal graphs via chordal embedding, Optim. Meth. and Soft., 23 (2008), 501-520.doi: 10.1080/10556780802102693.


    Y. H. Dai and R. Fletcher, Projected Barzilai-Borwein methods for large-scale box-constrained quadratic programming, Numer. Math., 100 (2005), 21-47.doi: 10.1007/s00211-004-0569-y.


    A. d'Aspremont, O. Banerjee and L. E. Ghaoui, First-order methods for sparse covariance selection, SIAM J. Matrix Anal. Appl., 30 (2008), 56-66.doi: 10.1137/060670985.


    M. N. Do and M. Vetterli, The contourlet transform: an efficient directional multiresolution image representation, IEEE Trans. on Image Proc., 14 (2005), 2091-2106.doi: 10.1109/TIP.2005.859376.


    W. Dong, L. Zhang, G. Shi and X. Wu, Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization, IEEE Trans. Image Proc., 20 (2011), 1838-1857.doi: 10.1109/TIP.2011.2108306.


    M. Elad, P. Milanfar and R. Rubinstein, Analysis versus synthesis in signal priors, Inver. Problems, 23 (2007), 947-968.doi: 10.1088/0266-5611/23/3/007.


    K. Engan, K. Skretting and J. H. Husφy, Family of iterative ls-based dictionary learning algorithms, ils-dla, for sparse signal representation, Digit. Sig. Proc., 17 (2007), 32-49.doi: 10.1016/j.dsp.2006.02.002.


    M. A. T. Figueiredo, R. D. Nowak and S. J. Wright, Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems, IEEE J. on Sig. Proc., 1 (2007), 586-597.


    M. A. T. Figueiredo, R. D. Nowak and S. J. Wright, Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems, IEEE J. in Sig. Proc., 1 (2007), 586-597.


    J. Friedman, T. Hastie and R. Tibshirani, Sparse inverse covariance estimation with the graphical lasso, Biostatistics, 9 (2008), 432-441.doi: 10.1093/biostatistics/kxm045.


    A. Genkin, D. Lewis and D. Madigan, Large-scale bayesian logistic regression for text categorization, Technometrics, 49 (2007), 291-304.doi: 10.1198/004017007000000245.


    M. Girolami, A variational method for learning sparse and overcomplete representations, Neu. Comput., 13 (2001), 2517-2532.doi: 10.1162/089976601753196003.


    G. H. Golub, M. T. Heath and G. Wahba, Generalized cross-validation as a method for choosing a good ridge parameter, Technometrics, 21 (1979), 215-223.doi: 10.1080/00401706.1979.10489751.


    E. Haber, U. Ascher and D. Oldenburg, Inversion of 3D electromagnetic data in frequency and time domain using an inexact all-at-once approach, Geophysics, 69 (2004), 1216-1228.


    E. Haber, D. Oldenburg and A. Celler, Recovery of dynamic parameters in SPECT, IEEE trans. on Nuc. Sci., 44 (1997), 2425-2430.


    P. C. Hansen, "Rank Deficient and Ill-Posed Problems," SIAM, Philadelphia, 1998.doi: 10.1137/1.9780898719697.


    L. Horesh and E. Haber, Sensitivity computation of the $l_1$ minimization problem and its application to dictionary design of ill-posed problems, Inver. Problems, 25 (2009), 20pp.doi: 10.1088/0266-5611/25/9/095009.


    N. Johnson and S. Kotz, "Continuous Multivariate Distributions," Wiley, New York, 1972.


    S. M. Kakade, S. Shalev-Shwartz and A. Tewari, "Regularization Techniques for Learning with Matrices," J. Mach. Learning Research, accepted, 2012.


    S. Kotz, T. J. Kozubowski and K. Podgórski, "Laplace Distribution and Generalizations: A Revisit with Applications to Communications, Economics, Engineering, and Finance," Birkhäuser, Boston, 2001.


    K. Kreutz-Delgado, J. F. Murray, B. D. Rao, K. Engan, T. W. Lee and T. J. Sejnowski, Dictionary learning algorithms for sparse representation, Neu. Comput., 15 (2003), 349-396.doi: 10.1162/089976603762552951.


    G. Kutyniok, D. Labate, W. Q. Lim and G. Weiss, Sparse multidimensional representation using shearlets, in "Proc. of SPIE," 5914, (2005), 254-262.doi: 10.1117/12.613494.


    M. S. Lewicki and T. J. Sejnowski, Learning overcomplete representations, Neu. Comput., 12 (2000), 337-365.doi: 10.1162/089976600300015826.


    A. S. Lewis and M. L. Overton, "Nonsmooth Optimization Via BFGS," SIAM J. Optim., in press, 2011.


    Z. Lu, Smooth optimization approach for sparse covariance selection, SIAM J. Optim., 19 (2009), 807-1827.doi: 10.1137/070695915.


    J. Mairal, F. Bach, J. Ponce, G. Sapiro and A. Zisserman, Supervised dictionary learning, 21 (2009), 1033-1040.


    S. Mallat, "A Wavelet Tour of Signal Processing, Third Edition: The Sparse Way," Academic Press, 2008.


    D. Mihovilovic and R. N. Bracewell, Adaptive chirplet representation of signals on time-frequency plane, Electronics Letters, 27 (1991), 1159-1161.doi: 10.1049/el:19910723.


    J. F. Murray and K. Kreutz-Delgado, Learning sparse overcomplete codes for images, J. VLSI Sig. Proc., 45 (2006), 97-110.doi: 10.1007/s11265-006-9774-5.


    C. Papageorgiou and T. Poggio, "A Trainable Object Detection System: Car Detection in Static Images," Technical Report 1673, mitai, (CBCL Memo 180), October 1999.


    A. Pázman, "Foundations of Optimum Experimental Design," Springer-Verlag, New York, 1986.


    E. Le Pennec and S. Mallat, Sparse geometric image representations with bandelets, IEEE Trans. on Image Proc., 14 (2005), 423-438.doi: 10.1109/TIP.2005.843753.


    G. Peyré and J. Fadili, Learning analysis sparsity priors, in "Proc. of Sampta'11," 2011.


    R. Rubinstein, M. Zibulevsky and M. Elad, Double sparsity: learning sparse dictionaries for sparse signal approximation, IEEE Trans. on Sig. Proc., 58 (2010), 1553-1564.


    G. Sapiro, "Geometric Partial Differential Equations and Image Analysis," Cambridge, 2001.


    O. Scherzer, Scale-space methods and regularization for denoising and inverse problems, Advances in Image and Elect. Phys., 128 (2003), 445-530.


    G. Schwarz, Estimating the dimension of a model, The Annals of Stat., 6 (1978), 461-464.doi: 10.1214/aos/1176344136.


    D. F. Shanno and K. Phua, Matrix conditioning and nonlinear optimization, Math. Program., 14 (1978), 149-160.doi: 10.1007/BF01588962.


    V. Sindhwani and A. Ghoting, "Large-Scale Distributed Non-Negative Sparse Coding and Sparse Dictionary Learning," SIGKDD, 2012.


    M. Sugiyama and H. Ogawa, Optimal design of regularization term and regularization parameter by subspace information criterion, Neu. Netw., 15 (2002), 349-361.doi: 10.1016/S0893-6080(02)00022-9.


    R. Tibshirani, Regression shrinkage and selection via the lasso, J. Royal. Statist. Soc B., 58 (1996), 267-288.


    A. N. Tychonoff, Solution of incorrectly formulated problems and the regularization method, Soviet Math. Dokl., 4 (1963), 1035-1038.


    E. van den Berg and M. P. Friedlander, Probing the pareto frontier for basis pursuit solutions, SIAM J. Scient. Comput., 31 (2008), 890-912.doi: 10.1137/080714488.


    V. Vapnik, "Statistical Learning Theory," Wiley, 1998.


    C. R. Vogel, "Computational Methods for Inverse Problem," SIAM, Philadelphia, 2002.doi: 10.1137/1.9780898717570.


    H. Wang, B. Li and C. Leng, Shrinkage tuning parameter selection with a diverging number of parameters, J. R. Statist. Soc. B, 71 (2009), 671-683.doi: 10.1111/j.1467-9868.2008.00693.x.


    D. P. Wipf and B. D. Rao, Sparse bayesian learning for basis selection, IEEE Trans. on Sig. Proc., 52 (2004), 2153-2164.


    Z. J. Xiang, H. Xu and P. J. Ramadge, Learning sparse representations of high dimensional data on large scale dictionaries, 24 (2011), 900-908.


    X. Chen and C. Zhang, Smoothing projected gradient method and its application to stochastic linear complementarity problems, SIAM J. Optim., 20, (2009), 627-649.doi: 10.1137/070702187.

  • 加载中

Article Metrics

HTML views() PDF downloads(140) Cited by(0)

Access History

Other Articles By Authors



    DownLoad:  Full-Size Img  PowerPoint