Article Contents
Article Contents

# Nonconvex mixed matrix minimization

• * Corresponding author: Xianchao Xiu
This work was supported in part by the National Natural Science Foundation of China (61633001, 11671029, 11601348).
• Given the ultrahigh dimensionality and the complex structure, which contains matrices and vectors, the mixed matrix minimization becomes crucial for the analysis of those data. Recently, the nonconvex functions such as the smoothly clipped absolute deviation, the minimax concave penalty, the capped $\ell_1$-norm penalty and the $\ell_p$ quasi-norm with $0<p<1$ have been shown remarkable advantages in variable selection due to the fact that they can overcome the over-penalization. In this paper, we propose and study a novel nonconvex mixed matrix minimization, which combines the low-rank and sparse regularzations and nonconvex functions perfectly. The augmented Lagrangian method (ALM) is proposed to solve the noncovnex mixed matrix minimization problem. The resulting subproblems either have closed-form solutions or can be solved by fast solvers, which makes the ALM particularly efficient. In theory, we prove that the sequence generated by the ALM converges to a stationary point when the penalty parameter is above a computable threshold. Extensive numerical experiments illustrate that our proposed nonconvex mixed matrix minimization model outperforms the existing ones.

Mathematics Subject Classification: Primary: 90C26, 90C46; Secondary: 90C90, 65K10.

 Citation:

• Table 1.  Comparison Results for Synthetic Data

 Cases RMSE(B) RMSE($\gamma$) $r$ $s$ LMM LSMM NonMM LMM LSMM NonMM 5 0.01 0.0000 0.0000 0.0000 0.1806 0.1421 0.1101 0.05 0.0456 0.0437 0.0274 0.6453 0.2742 0.2336 0.1 0.0905 0.0688 0.0496 0.9162 0.3788 0.3176 0.2 0.1314 0.1082 0.0911 1.0822 0.5063 0.4410 0.5 0.2744 0.2152 0.1013 1.5868 0.7371 0.7092 10 0.01 0.0000 0.0000 0.0000 0.1927 0.1513 0.1217 0.05 0.0781 0.05115 0.0382 0.7663 0.3312 0.2961 0.1 0.1072 0.0749 0.0536 1.0452 0.3879 0.3401 0.2 0.1339 0.1217 0.1161 1.2696 0.5305 0.4918 0.5 0.2947 0.2440 0.1278 1.9561 0.7971 0.7330 15 0.01 0.0000 0.0000 0.0000 0.2197 0.1920 0.1405 0.05 0.0982 0.0749 0.0654 0.8408 0.3223 0.2893 0.1 0.1323 0.1130 0.1108 1.2501 0.3534 0.3293 0.2 0.1846 0.1560 0.1272 1.6730 0.4988 0.4494 0.5 0.3176 0.2812 0.2176 2.3608 0.7817 0.7502 20 0.01 0.0000 0.0000 0.0000 0.1936 0.1532 0.1509 0.05 0.0987 0.0791 0.0732 0.9081 0.3168 0.2719 0.1 0.1528 0.1454 0.1402 1.416 0.4583 0.3961 0.2 0.1904 0.1821 0.1723 1.8304 0.4722 0.4148 0.5 0.3876 0.2922 0.2134 2.4961 0.8049 0.7891 30 0.01 0.0000 0.0000 0.0000 0.1768 0.1620 0.1400 0.05 0.1071 0.0932 0.0858 1.0089 0.2360 0.2232 0.1 0.1608 0.1572 0.1469 1.5286 0.3241 0.3062 0.2 0.2487 0.2371 0.2206 2.082 0.4920 0.4599 0.5 0.4170 0.3943 0.3725 2.6203 0.7856 0.8041

Table 2.  Comparison Results for Real-world Data

 Rate LMM LSMM NonMM 5-fold 0.1321 0.1178 0.1025 10-fold 0.1972 0.1630 0.1589
•  [1] S. Boyd, N. Parikh, E. Chu, B. Peleato and J. Eckstein, Distributed optimization and statistical learning via alternating direction method of multipliers, Foundations and Trends® in Machine Learning, 3 (2011), 1-122.  doi: 10.1561/2200000016. [2] J. Cai, E. Cand$\grave{e}$s and Z. Shen, A singular value thresholding algorithm for matrix completion, SIAM Journal on Optimization, 20 (2010), 1956-1982.  doi: 10.1137/080738970. [3] X. Chen, M. Ng and C. Zhang, Non-lipschitz $\ell_p$-regularization and box constrained model for image restoration, IEEE Transactions on Image Processing, 21 (2012), 4709-4721.  doi: 10.1109/TIP.2012.2214051. [4] Y. Chen, N. Xiu and D. Peng, Global solutions of non-Lipschitz $S_2-S_ p$ minimization over the positive semidefinite cone, Optimization Letters, 8 (2014), 2053-2064.  doi: 10.1007/s11590-013-0701-y. [5] D. Donoho, De-noising by soft-thresholding, IEEE Transactions on Information Theory, 41 (1995), 613-627.  doi: 10.1109/18.382009. [6] M. Elad, Why simple shrinkage is still relevant for redundant representations?, IEEE Transactions on Information Theory, 52 (2006), 5559-5569.  doi: 10.1109/TIT.2006.885522. [7] J. Fan, Comments on "wavelets in statistics: A review" by A. Antoniadis, Journal of the Italian Statistical Society, 6 (1997), 131-138.  doi: 10.1007/BF03178906. [8] J. Fan and R. Li, Variable selection via nonconcave penalized likelihood and its oracle properties, Journal of the American Statistical Association, 96 (2001), 1348-1360.  doi: 10.1198/016214501753382273. [9] J. Fan, L. Xue and H. Zou, Strong oracle optimality of folded concave penalized estimation, Annals of Statistics, 42 (2014), 819-849.  doi: 10.1214/13-AOS1198. [10] M. Fazel, T. Pong, D. Sun and P. Tseng, Hankel matrix rank minimization with applications to system identification and realization, SIAM Journal on Matrix Analysis and Applications, 34 (2013), 946-977.  doi: 10.1137/110853996. [11] D. Gabay, Chapter ix applications of the method of multipliers to variational inequalities, Studies in Mathematics and Its Applications, 15 (1983), 299-331.  doi: 10.1016/S0168-2024(08)70034-1. [12] D. Gabay and B. Mercier, A dual algorithm for the solution of nonlinear variational problems via finite element approximation, Computers and Mathematics with Applications, 2 (1976), 17-40.  doi: 10.1016/0898-1221(76)90003-1. [13] M. Hong, Z. Luo and M. Razaviyayn, Convergence analysis of alternating direction method of multipliers for a family of nonconvex problems, SIAM Journal on Optimization, 26 (2016), 337-364.  doi: 10.1137/140990309. [14] G. Li and T. Peng., Douglass-Rachford splitting for nonconvex optimization with application to nonconvex feasibility problems, Mathematical Programming, 159 (2016), 371-401.  doi: 10.1007/s10107-015-0963-5. [15] S. Negahban and M. Wainwright, Estimation of (near) low-rank matrices with noise and high-dimensional scaling, Annals of Statistics, 39 (2011), 1069-1097.  doi: 10.1214/10-AOS850. [16] M. Nikolova, M. Ng, S. Zhang and W. Ching, Efficient reconstruction of piecewise constant images using nonsmooth nonconvex minimization, SIAM Journal on Imaging Sciences, 1 (2008), 2-25.  doi: 10.1137/070692285. [17] G. Obozinski, M. Wainwright and M. Jordan, Support union recovery in high-dimensional multivariate regression, Annals of Statistics, 39 (2011), 1-47.  doi: 10.1214/09-AOS776. [18] L. Rudin and S. Osher, Total variation based image restoration with free local constraints, In Proceedings of the IEEE International Conference on Image Processing, 1 (1994), 31-35.  doi: 10.1109/ICIP.1994.413269. [19] R. Rockafellar and R. Wets, Variational Analysis, Springer Science and Business Media, 2009. doi: 10.1007/978-3-642-02431-3. [20] P. Shang and L. Kong, On the degrees of freedom of mixed matrix regression, Mathematical Problems in Engineering, 2017 (2017), Art. ID 6942865, 8 pp. doi: 10.1155/2017/6942865. [21] R. Tibshirani, Regression shrinkage and selection via the Lasso, Journal of the Royal Statistical Society, Series B, 58 (1996), 267-288.  doi: 10.1111/j.2517-6161.1996.tb02080.x. [22] R. Tibshirani, M. Saunders, S. Rosset, J. Zhu and K. Knight, Sparsity and smoothness via the fused Lasso, Journal of the Royal Statistical Society, Series B, 67 (2005), 91-108.  doi: 10.1111/j.1467-9868.2005.00490.x. [23] F. Wang, W. Cao and Z. Xu., Convergence of multi-block Bregman ADMM for nonconvex composite problems, Science China Information Sciences, 61 (2018), 122101, 12pp. doi: 10.1007/s11432-017-9367-6. [24] X. Xiu, L. Kong, Y. Li and H. Qi, Iterative reweighted methods for $\ell_1-\ell_p$ minimization, Computational Optimization and Applications, 70 (2018), 201-219.  doi: 10.1007/s10589-017-9977-7. [25] X. Xiu, W. Liu, L. Li and L. Kong, Alternating direction method of multipliers for nonconvex fused regression problems, Computational Statistics and Data Analysis, 136 (2019), 59-71.  doi: 10.1016/j.csda.2019.01.002. [26] L. Yang, T. Pong and X. Chen, Alternating direction method of multipliers for a class of nonconvex and nonsmooth problems with applications to background/foreground extraction, SIAM Journal on Imaging Sciences, 10 (2017), 74-110.  doi: 10.1137/15M1027528. [27] M. Yuan, A. Ekici, Z. Lu and R. Monteiro, Dimension reduction and coefficient estimation in multivariate linear regression, Journal of Royal Statistical Society, Series B, 69 (2007), 329-346.  doi: 10.1111/j.1467-9868.2007.00591.x. [28] C. Zhang, Nearly unbiased variable selection under minimax concave penalty, Annals of Statistics, 38 (2010), 894-942.  doi: 10.1214/09-AOS729. [29] T. Zhang, Analysis of multi-stage convex relaxation for sparse regularization, Journal of Machine Learning Research, 11 (2010), 1081-1107. [30] H. Zou and T. Hastie, Regularization and variable selection via the elastic net, Journal of the Royal Statistical Society, Series B, 67 (2005), 301-320.  doi: 10.1111/j.1467-9868.2005.00503.x. [31] H. Zhou and L. Li, Regularized matrix regression, Journal of the Royal Statistical Society, Series B, 76 (2014), 463-483.  doi: 10.1111/rssb.12031. [32] T. Zhou and D. Tao, Godec: Randomized low-rank and sparse matrix decomposition in noisy case, International Conference on Machine Learning, 2011.

Tables(2)