\`x^2+y_1+z_12^34\`
Advanced Search
Article Contents
Article Contents

A review of algorithms for distributionally robust optimization using statistical distances

  • *Corresponding author: Suhong Jiang

    *Corresponding author: Suhong Jiang 

The first author and fifth author are supported by the National Natural Science Foundation of China (Grant Nos. 12122107, 72394363/72394360). The second author is supported by the National Natural Science Foundation of China (Grant No. 12101297).

Abstract / Introduction Full Text(HTML) Figure(2) / Table(1) Related Papers Cited by
  • Distributionally robust optimization (DRO) utilizing statistical distances has witnessed significant advancements, driven by its appealing properties and promising applications. However, its computational complexity poses a challenge, especially in large-scale scenarios. To address this challenge, extensive research efforts have been directed towards developing algorithms tailored for DRO problems utilizing statistical distances. This paper focuses on exploring solution methodologies for DRO problems mainly incorporating both $ \phi $-divergences and Wasserstein metric, providing a comprehensive survey of existing algorithmic frameworks.

    Mathematics Subject Classification: Primary: 90-02, 90C06, 90C17, 90C47.

    Citation:

    \begin{equation} \\ \end{equation}
  • 加载中
  • Figure 1.  The framework of algorithm design for $ \phi $-divergence DRO

    Figure 2.  The framework of algorithm design for Wasserstein DRO

    Table 1.  Examples of $ \phi $-divergence function and its conjugate function

    Divergence $ \phi(t) $ $ \phi(t), t\geq 0 $ $ I_\phi (p, q) $ $ \phi^*(s) $
    Kullback-Leibler $ \phi_{kl} $ $ t\log t -t +1 $ $ \sum p_i\log (\frac{p_i}{q_i}) $ $ e^s-1 $
    $ \chi^2 $-distance $ \phi_{\chi^2} $ $ \frac{1}{t}{(t-1)}^2 $ $ \sum \frac{{(p_i-q_i)}^2}{p_i} $ $ 2-2\sqrt{1-s}, s<1 $
    Likelihood $ \phi_l $ $ -\log t $ $ \sum q_i \log \frac{q_i}{p_i} $ $ -\log(-s)-1, s<0 $
    Burg entropy $ \phi_b $ $ -\log t + t -1 $ $ \sum q_i \log \frac{q_i}{p_i} $ $ -\log(1-s), s<1 $
    Modified $ \chi^2 $-distance $ \phi_{m\chi^2} $ $ {(t-1)}^2 $ $ \sum \frac{{(p_i-q_i)}^2}{q_i} $ $ \begin{cases}-1, & s<-2\\ s+\frac{s^{2}}{4}, &s \geq-2\end{cases} $
    Variation distance $ \phi_{v} $ $ \vert t-1\vert $ $ \sum \vert p_i-q_i\vert $ $ \begin{cases}-1, &s\leq-1\\ s, &-1\leq s\leq 1\end{cases} $
    $ J $-divergence $ \phi_J $ $ (t-1)\log t $ $ \sum (p_i-q_i)log\left(\frac{p_i}{q_i}\right) $ No closed form
    Hellinger distance $ \phi_h $ $ (\sqrt{t}-1)^2 $ $ \sum(\sqrt{p_i}-\sqrt{q_i})^2 $ $ \frac{s}{1-s}, s<1 $
     | Show Table
    DownLoad: CSV
  • [1] M. BansalK. Huang and S. Mehrotra, Decomposition algorithms for two-stage distributionally robust mixed binary programs, SIAM Journal on Optimization, 28 (2018), 2360-2383.  doi: 10.1137/17M1115046.
    [2] G. Bayraksan and D. K. Love, Data-driven stochastic programming using phi-divergences, In The Operations Research Revolution, INFORMS, (2015), 1-19.
    [3] A. Ben-TalD. Den HertogA. De WaegenaereB. Melenberg and G. Rennen, Robust solutions of optimization problems affected by uncertain probabilities, Management Science, 59 (2013), 341-357.  doi: 10.1287/mnsc.1120.1641.
    [4] A. Ben-Tal and A. Nemirovski, Robust solutions of linear programming problems contaminated with uncertain data, Mathematical Programming, 88 (2000), 411-424.  doi: 10.1007/PL00011380.
    [5] J. BlanchetL. Chen and X. Y. Zhou, Distributionally robust mean-variance portfolio selection with Wasserstein distances, Management Science, 68 (2022), 6382-6410.  doi: 10.1287/mnsc.2021.4155.
    [6] J. Blanchet and K. Murthy, Quantifying distributional model risk via optimal transport, Mathematics of Operations Research, 44 (2019), 565-600.  doi: 10.1287/moor.2018.0936.
    [7] J. BlanchetK. Murthy and F. Zhang, Optimal transport-based distributionally robust optimization: Structural properties and iterative schemes, Mathematics of Operations Research, 47 (2022), 1500-1529.  doi: 10.1287/moor.2021.1178.
    [8] S. Bubeck and N. Cesa-Bianchi, Regret analysis of stochastic and nonstochastic multi-armed bandit problems, Foundations and Trends in Machine Learning, 5 (2012), 101-112. 
    [9] G. C. Calafiore, Ambiguous risk measures and optimal robust portfolios, SIAM Journal on Optimization, 18 (2007), 853-877.  doi: 10.1137/060654803.
    [10] R. ChenI. C. Paschalidis and an d others, Distributionally robust learning, Foundations and Trends in Optimization, 4 (2020), 1-243.  doi: 10.1561/2400000026.
    [11] Y. ChenQ. GuoH. SunZ. LiW. Wu and Z. Li, A distributionally robust optimization model for unit commitment based on Kullback–Leibler divergence, IEEE Transactions on Power Systems, 33 (2018), 5147-5160.  doi: 10.1109/TPWRS.2018.2797069.
    [12] Y. ChenH. Sun and H. Xu, Decomposition and discrete approximation methods for solving two-stage distributionally robust optimization problems, Computational Optimization and Applications, 78 (2021), 205-238.  doi: 10.1007/s10589-020-00234-7.
    [13] Z. ChenD. Kuhn and W. Wiesemann, Data-driven chance constrained programs over Wasserstein balls, Operations Research, 72 (2024), 410-424.  doi: 10.1287/opre.2022.2330.
    [14] W. de Oliveira, Risk-averse stochastic programming and distributionally robust optimization via operator splitting, Set-Valued and Variational Analysis, Springer, 29 (2021), 861-891. doi: 10.1007/s11228-021-00600-5.
    [15] E. Delage and Y. Ye, Distributionally robust optimization under moment uncertainty with application to data-driven problems, Operations Research, 58 (2010), 595-612.  doi: 10.1287/opre.1090.0741.
    [16] J. Duchi and H. Namkoong, Variance-based regularization with convex objectives, The Journal of Machine Learning Research, 20 (2019), 2450-2504. 
    [17] N. Fournier and A. Guillin, On the rate of convergence in Wasserstein distance of the empirical measure, Probability Theory and Related Fields, 162 (2015), 707-738.  doi: 10.1007/s00440-014-0583-7.
    [18] C. FrognerS. ClaiciE. Chien and J. Solomon, Incorporating unlabeled data into distributionally robust learning, Journal of Machine Learning Research, 22 (2021), 1-46. 
    [19] H. Gangammanavar and M. Bansal, Stochastic decomposition method for two-stage distributionally robust optimization, preprint, 2020, arXiv: 2011.08376.
    [20] R. GaoX. Chen and A. J. Kleywegt, Wasserstein distributionally robust optimization and variation regularization, Operations Research, 72 (2024), 1177-1191.  doi: 10.1287/opre.2022.2383.
    [21] R. Gao and A. Kleywegt, Distributionally robust stochastic optimization with Wasserstein distance, Mathematics of Operations Research, 48 (2023), 603-655.  doi: 10.1287/moor.2022.1275.
    [22] S. Ghosh and H. Lam, Robust analysis in stochastic simulation: Computation and performance guarantees, Operations Research, 67 (2019), 232-249, INFORMS.
    [23] S. Ghosh and M. Squillante, Unbiased gradient estimation for distributionally robust learning, preprint, 2020, arXiv: 2012.12367.
    [24] S. Ghosh, M. Squillante and E. Wollega, Efficient stochastic gradient descent for learning with distributionally robust optimization, preprint, 2018, arXiv: 1805.08728.
    [25] J. Goh and M. Sim, Distributionally robust optimization and its tractable approximations, Operations Research, 58 (2010), 902-917.  doi: 10.1287/opre.1090.0795.
    [26] A. R. Hota, A. Cherukuri and J. Lygeros, Data-driven chance constrained optimization under Wasserstein ambiguity sets, In 2019 American Control Conference (ACC), IEEE, (2019), 1501-1506.
    [27] Z. Hu and L. J. Hong, Kullback-Leibler divergence constrained distributionally robust optimization, Optimization Online, 2013, https://optimization-online.org/?p = 12225.
    [28] R. HuangS. QuX. Yang and Z. Liu, Multi-stage distributionally robust optimization with risk aversion, Journal of Industrial and Management Optimization, 17 (2021), 233. 
    [29] H. Husain, V. Nguyen, and A. van den Hengel, Distributionally robust bayesian optimization with $\phi$-divergences, Advances in Neural Information Processing Systems, 36 (2024).
    [30] R. Ji and M. A. Lejeune, Data-driven optimization of reward-risk ratio measures, INFORMS Journal on Computing, 33 (2021), 1120-1137.  doi: 10.1287/ijoc.2020.1002.
    [31] R. Jiang and Y. Guan, Risk-averse two-stage stochastic program with distributional ambiguity, Operations Research, 66 (2018), 1390-1405.  doi: 10.1287/opre.2018.1729.
    [32] R. Jiang and Y. Guan, Data-driven chance constrained stochastic program, Mathematical Programming, 158 (2016), 291-327.  doi: 10.1007/s10107-015-0929-7.
    [33] K. Kim and S. Mehrotra, A two-stage stochastic integer programming approach to integrated staffing and scheduling with application to nurse management, Operations Research, 63 (2015), 1431-1451.  doi: 10.1287/opre.2015.1421.
    [34] D. Kuhn, P. Mohajerin Esfahani, V. A. Nguyen and S. Shafieezadeh-Abadeh, Wasserstein distributionally robust optimization: Theory and applications in machine learning, In Operations Research and Management Science in the Age of Analytics, (2019), 130-166.
    [35] C. Lee and S. Mehrotra, A distributionally-robust approach for finding support vector machines, Optimization Online, 2015, https://optimization-online.org/?p = 13477.
    [36] D. Levy, Y. Carmon, J. C. Duchi and A. Sidford, Large-Scale methods for distributionally robust optimization, Advances in Neural Information Processing Systems, 33 (2020).
    [37] J. Li, C. Chen and A. Man-Cho So, Fast epigraphical projection-based incremental algorithms for wasserstein distributionally robust support vector machine, Advances in Neural Information Processing Systems, 33 (2020).
    [38] J. Li, S. Huang and A. Man-Cho So, A first-order algorithmic framework for Wasserstein distributionally robust logistic regression, In Proceedings of the 33rd International Conference on Neural Information Processing Systems, (2019), 3937-3947.
    [39] F. LinX. Fang and Z. Gao, Distributionally robust optimization: A review on theory and applications, Numerical Algebra, Control and Optimization, 12 (2022), 159-212.  doi: 10.3934/naco.2021057.
    [40] Y. LiuX. YuanS. Zeng and J. Zhang, Primal–dual hybrid gradient method for distributionally robust optimization problems, Operations Research Letters, 45 (2017), 625-630.  doi: 10.1016/j.orl.2017.10.001.
    [41] D. Love and G. Bayraksan, Phi-divergence constrained ambiguous stochastic programs for data-driven optimization, Technical report, Department of Integrated Systems Engineering, The Ohio State University, Columbus, Ohio, 2015.
    [42] F. Luo and S. Mehrotra, Decomposition algorithm for distributionally robust optimization using Wasserstein metric with an application to a class of regression models, European Journal of Operational Research, 278 (2019), 20-35.  doi: 10.1016/j.ejor.2019.03.008.
    [43] A. Madry, A. Makelov, L. Schmidt, D. Tsipras and A. Vladu, Towards deep learning models resistant to adversarial attacks, In International Conference on Learning Representations, 2018.
    [44] P. Mohajerin Esfahani and D. Kuhn, Data-driven distributionally robust optimization using the Wasserstein metric: Performance guarantees and tractable reformulations, Mathematical Programming, 171 (2018), 115-166.  doi: 10.1007/s10107-017-1172-1.
    [45] H. Namkoong and J. C. Duchi, Stochastic gradient methods for distributionally robust optimization with f-divergences, In Proceedings of the 30th International Conference on Neural Information Processing Systems, (2016), 2216-2224.
    [46] A. NemirovskiA. JuditskyG. Lan and A. Shapiro, Robust stochastic approximation approach to stochastic programming, SIAM Journal on Optimization, 19 (2009), 1574-1609.  doi: 10.1137/070704277.
    [47] L. Pardo, Statistical Inference Based on Divergence Measures, Stat. Textb. Monogr., 185 Chapman & Hall/CRC, Boca Raton, FL, 2006.
    [48] I. Popescu, Robust mean-covariance solutions for stochastic optimization, Operations Research, 55 (2007), 98-112.  doi: 10.1287/opre.1060.0353.
    [49] J. Quionero-CandelaM. SugiyamaA. Schwaighofer and  N. D. LawrenceDataset Shift in Machine Learning, The MIT Press, 2009. 
    [50] H. RahimianG. Bayraksan and T. Homem-de-Mello, Identifying effective scenarios in distributionally robust stochastic programs with total variation distance, Mathematical Programming, 173 (2019), 393-430.  doi: 10.1007/s10107-017-1224-6.
    [51] H. Rahimian and S. Mehrotra, Distributionally robust optimization: A review, preprint, 2019, arXiv: 1908.05659.
    [52] R. Reemtsen and J. J. Rückmann, Semi-Infinite Programming, Springer Science & Business Media, 2013.
    [53] H. Scarf, K. Arrow and S. Karlin, A min-max solution of an inventory problem, in Studies in the Mathematical Theory of Inventory and Production, Stanford University Press, Stanford, CA, 10 (1958), 201-209.
    [54] S. Shafieezadeh-AbadehD. Kuhn and P. Mohajerin Esfahani, Regularization via mass transportation, Journal of Machine Learning Research, 20 (2019), 1-68. 
    [55] S. Shafieezadeh-Abadeh, P. Mohajerin Esfahani and D. Kuhn, Distributionally robust logistic regression, In Proceedings of the 28th International Conference on Neural Information Processing Systems, (2015), 1576-1584.
    [56] A. Sinha, H. Namkoong and J. Duchi, Certifying some distributional robustness with principled adversarial training, In International Conference on Learning Representations, 2018.
    [57] Z. WangP. W. Glynn and Y. Ye, Likelihood robust optimization for data-driven problems, Computational Management Science, 13 (2016), 241-261.  doi: 10.1007/s10287-015-0240-3.
    [58] W. WiesemannD. Kuhn and M. Sim, Distributionally robust convex optimization, Operations Research, 62 (2014), 1358-1376.  doi: 10.1287/opre.2014.1314.
    [59] T. Xia, J. Liu and A. Lisser, Distributionally robust chance constrained Markov decision process with Kullback-Leibler divergence, preprint, 2023, arXiv: 2305.02167.
    [60] W. Xie, On distributionally robust chance constrained programs with Wasserstein distance, Mathematical Programming, 186 (2021), 115-155.  doi: 10.1007/s10107-019-01445-5.
    [61] W. Xie and S. Ahmed, Bicriteria approximation of chance-constrained covering problems, Operations Research, 68 (2020), 516-533. 
    [62] Y. Yu, T. Lin, E. V. Mazumdar and M. Jordan, Fast distributionally robust learning with variance-reduced min-max optimization, In International Conference on Artificial Intelligence and Statistics, (2022), 1219-1250.
    [63] S. ZhuL. XieM. ZhangR. Gao and Y. Xie, Distributionally robust weighted k-nearest neighbors, Advances in Neural Information Processing Systems, 35 (2022), 29088-29100. 
    [64] Z. ZhangS. Ahmed and G. Lan, Efficient algorithms for distributionally robust stochastic optimization with discrete scenario support, SIAM Journal on Optimization, 31 (2021), 1690-1721.  doi: 10.1137/19M1290115.
    [65] C. Zhao and Y. Guan, Data-driven risk-averse two-stage stochastic program with $\zeta$-structure probability metrics, Available on Optimization Online, 2 (2015), 1-40. 
    [66] Y. ZhaoY. Liu and X. Yang, Distributionally robust reward-risk ratio programming with Wasserstein metric, Pacific Journal on Optimization, 15 (2019), 69-90. 
  • 加载中

Figures(2)

Tables(1)

SHARE

Article Metrics

HTML views(7635) PDF downloads(300) Cited by(0)

Access History

Other Articles By Authors

Catalog

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return