July 2007, 3(3): 429-444. doi: 10.3934/jimo.2007.3.429

Learning algorithms for finite horizon constrained Markov decision processes

IE and OR Interdisciplinary Programme, IIT Bombay, Mumbai, 400 076, India

Received April 2006; Revised November 2006; Published July 2007

We propose one heuristic and two stochastic-approximation-based learning algorithms for finite-horizon constrained Markov decision models with finite state and action spaces. We include models and numerical examples, each subject to performance-based constraints, arising from risk management in fund allocation, retailer-depot product availability in a supply chain, and admission control in a simple queue.
Citation: A. Mittal, N. Hemachandra. Learning algorithms for finite horizon constrained Markov decision processes. Journal of Industrial & Management Optimization, 2007, 3 (3) : 429-444. doi: 10.3934/jimo.2007.3.429

