June 2014, 7(3): 503-523. doi: 10.3934/dcdss.2014.7.503

Learning and estimation applications of an online homotopy algorithm for a generalization of the LASSO

1. Electrical Engineering and Computer Science, UC Berkeley, United States

2. Mechanical Engineering, UC Berkeley, United States

3. Electrical Engineering and Computer Science, Civil and Environmental Engineering, UC Berkeley, United States

Received: May 2013. Revised: August 2013. Published: January 2013.

The LASSO is a widely used shrinkage and selection method for linear regression. We propose a generalization of the LASSO in which the $l_1$ penalty is applied to a linear transformation of the regression parameters, which allows prior information on the structure of the problem to be incorporated and improves the interpretability of the results. We also study time-varying systems with an $l_1$ penalty on the variations of the state, leading to estimates that exhibit few "jumps". We propose a homotopy algorithm that updates the solution as additional measurements become available. The algorithm takes advantage of the sparsity of the solution for computational efficiency and is promising for mining large datasets. The algorithm is demonstrated on three experimental data sets, representing applications to traffic estimation from sparsely sampled probe vehicles, flow estimation in tidal channels, and text analysis of online news.
Citation: Aude Hofleitner, Tarek Rabbani, Mohammad Rafiee, Laurent El Ghaoui, Alex Bayen. Learning and estimation applications of an online homotopy algorithm for a generalization of the LASSO. Discrete & Continuous Dynamical Systems - S, 2014, 7 (3) : 503-523. doi: 10.3934/dcdss.2014.7.503


