American Institute of Mathematical Sciences

April  2018, 5(2): 143-163. doi: 10.3934/jdg.2018009

A risk minimization problem for finite horizon semi-Markov decision processes with loss rates

1. School of Mathematical Sciences, South China Normal University, Guangzhou 510631, China
2. School of Economics and Statistics, Guangzhou University, Guangzhou 510006, China

* Corresponding author. Emails: liuql@m.scnu.edu.cn; zouxiaol@gzhu.edu.cn

Received June 2017; Revised July 2017; Published February 2018

Fund Project: Research supported by the Natural Science Foundation of Guangdong Province (Grant No. 2014A030313438), Zhujiang New Star (Grant No. 201506010056) and the Guangdong Province outstanding young teacher training plan (Grant No. YQ2015050).

This paper deals with the risk probability for finite horizon semi-Markov decision processes with loss rates. The criterion to be minimized is the risk probability that the total loss incurred during a finite horizon exceeds a given loss level. For this optimality problem, we first establish the optimality equation and prove that the optimal value function is the unique solution to it. We then show the existence of an optimal policy and develop a value iteration algorithm for computing the value function and optimal policies. We also derive an approximation of the value function and the rules of iteration. Finally, a numerical example is given to illustrate our results.
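To make the criterion concrete, here is a minimal sketch of backward recursion for a risk-probability problem. It is a deliberately simplified discrete-time analogue, not the semi-Markov model with loss rates treated in the paper: decisions occur at unit time steps, each step incurs a nonnegative integer loss, and the loss level is tracked as a remaining budget. The transition table `P`, the loss table `C`, the action names and the function `min_risk` are all hypothetical and chosen only for illustration.

```python
from functools import lru_cache

# Hypothetical toy model (an assumption, not taken from the paper).
# P[action][state][next_state] = transition probability
# C[action][state]             = nonnegative loss incurred at this step
P = {
    "safe":  [[1.0, 0.0], [0.5, 0.5]],
    "risky": [[0.5, 0.5], [0.0, 1.0]],
}
C = {
    "safe":  [2, 2],
    "risky": [0, 4],
}


@lru_cache(maxsize=None)
def min_risk(state, steps_left, level):
    """Return (minimal risk probability, minimizing first action).

    The risk probability is the chance that the total loss over the
    remaining `steps_left` decisions exceeds the remaining `level`.
    """
    # Losses are nonnegative, so a negative budget means the level
    # has already been exceeded.
    if level < 0:
        return 1.0, None
    # Horizon reached without exceeding the level.
    if steps_left == 0:
        return 0.0, None
    candidates = []
    for action in P:
        remaining = level - C[action][state]
        risk = sum(
            prob * min_risk(nxt, steps_left - 1, remaining)[0]
            for nxt, prob in enumerate(P[action][state])
        )
        candidates.append((risk, action))
    # Smallest risk wins; ties are broken by action name.
    return min(candidates)


if __name__ == "__main__":
    print(min_risk(0, 2, 1))
```

In this toy setting the recursion plays the role of an optimality equation: the value with `steps_left` decisions remaining is the minimum over actions of the expected value one step later, evaluated at the reduced loss level. The continuous-time sojourn times and loss rates of the actual model are not represented here.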

Citation: Qiuli Liu, Xiaolong Zou. A risk minimization problem for finite horizon semi-Markov decision processes with loss rates. Journal of Dynamics & Games, 2018, 5 (2) : 143-163. doi: 10.3934/jdg.2018009

The function $F^*_{(n_0+1)k}(1, t, \lambda)$
The function $F^*_{(n_0+1)k}(2, t, \lambda)$
The function $H^aF^*_{(n_0+1)k-1}(i, 10, \lambda)$
The function $H^aF^*_{(n_0+1)k-1}(i, 15, \lambda)$
The function $\lambda^*(i, t)$
