A perturbation approach to a class of discounted approximate value iteration algorithms with Borel spaces
1. Departamento de Matemáticas, Universidad de Sonora, Rosales s/n, Col. Centro, 83000, Hermosillo, Sonora, Mexico
References:
[1] A. Almudevar, Approximate fixed point iteration with an application to infinite horizon Markov decision processes, SIAM Journal on Control and Optimization, 47 (2008), 2303-2347. doi: 10.1137/S0363012904441520.
[2] E. F. Arruda, M. D. Fragoso and J. B. R. do Val, Approximate dynamic programming via direct search in the space of value function approximations, European Journal of Operational Research, 211 (2011), 343-351. doi: 10.1016/j.ejor.2010.11.019.
[3] D. P. Bertsekas, Dynamic Programming: Deterministic and Stochastic Models, Prentice-Hall, Englewood Cliffs, NJ, 1987.
[4] D. P. Bertsekas, Approximate policy iteration: A survey and some new methods, Journal of Control Theory and Applications, 9 (2011), 310-335. doi: 10.1007/s11768-011-1005-3.
[5] D. P. Bertsekas and J. N. Tsitsiklis, Neuro-Dynamic Programming, Athena Scientific, Belmont, MA, 1996.
[6] L. Beutel, H. Gonska and D. Kacsó, On variation-diminishing Schoenberg operators: New quantitative statements, in Multivariate Approximation and Interpolation with Applications (ed. M. Gasca), Monografías de la Academia de Ciencias de Zaragoza, 20 (2002), 9-58.
[7] R. A. DeVore, The Approximation of Continuous Functions by Positive Linear Operators, Lecture Notes in Mathematics 293, Springer-Verlag, Berlin Heidelberg, 1972.
[8] F. Dufour and T. Prieto-Rumeau, Approximation of infinite horizon discounted cost Markov decision processes, in Optimization, Control, and Applications of Stochastic Systems: In Honor of Onésimo Hernández-Lerma (eds. D. Hernández-Hernández and J. A. Minjárez-Sosa), Birkhäuser, (2012), 59-76. doi: 10.1007/978-0-8176-8337-5_4.
[9] D. P. de Farias and B. Van Roy, On the existence of fixed points for approximate value iteration and temporal-difference learning, Journal of Optimization Theory and Applications, 105 (2000), 589-608. doi: 10.1023/A:1004641123405.
[10] G. J. Gordon, Stable function approximation in dynamic programming, in Proceedings of the Twelfth International Conference on Machine Learning, Tahoe City, CA, 1995, 261-268. doi: 10.1016/B978-1-55860-377-6.50040-2.
[11] O. Hernández-Lerma, Adaptive Markov Control Processes, Springer-Verlag, NY, 1989. doi: 10.1007/978-1-4419-8714-3.
[12] O. Hernández-Lerma and J. B. Lasserre, Discrete-Time Markov Control Processes: Basic Optimality Criteria, Springer-Verlag, NY, 1996. doi: 10.1007/978-1-4612-0729-0.
[13] O. Hernández-Lerma and J. B. Lasserre, Further Topics on Discrete-Time Markov Control Processes, Springer-Verlag, NY, 1999. doi: 10.1007/978-1-4612-0561-6.
[14] D. R. Jiang and W. B. Powell, An approximate dynamic programming algorithm for monotone value functions, Operations Research, 63 (2015), 1489-1511. doi: 10.1287/opre.2015.1425.
[15] R. Munos, Performance bounds in $L_p$-norm for approximate value iteration, SIAM Journal on Control and Optimization, 46 (2007), 541-561. doi: 10.1137/040614384.
[16] W. B. Powell, Approximate Dynamic Programming: Solving the Curses of Dimensionality, John Wiley & Sons Inc., 2007. doi: 10.1002/9780470182963.
[17] W. B. Powell and J. Ma, A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications, Journal of Control Theory and Applications, 9 (2011), 336-352. doi: 10.1007/s11768-011-0313-y.
[18] W. B. Powell, Perspectives of approximate dynamic programming, Annals of Operations Research, 241 (2016), 319-356. doi: 10.1007/s10479-012-1077-6.
[19] M. L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, Wiley, NY, 1994.
[20] J. Rust, Numerical dynamic programming in economics, in Handbook of Computational Economics (eds. H. Amman, D. Kendrick and J. Rust), Elsevier, 1 (1996), 619-729.
[21] J. Stachurski, Continuous state dynamic programming via nonexpansive approximation, Computational Economics, 31 (2008), 141-160. doi: 10.1007/s10614-007-9111-5.
[22] J. Stachurski, Economic Dynamics: Theory and Computation, MIT Press, Cambridge, MA, 2009.
[23] S. Stidham Jr. and R. Weber, A survey of Markov decision models for control of networks of queues, Queueing Systems, 13 (1993), 291-314. doi: 10.1007/BF01158935.
[24] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, Cambridge, MA, 1998.
[25] B. Van Roy, Performance loss bounds for approximate value iteration with state space aggregation, Mathematics of Operations Research, 31 (2006), 234-244. doi: 10.1287/moor.1060.0188.
[26] O. Vega-Amaya and R. Montes-de-Oca, Application of average dynamic programming to inventory systems, Mathematical Methods of Operations Research, 47 (1998), 451-471. doi: 10.1007/BF01198405.
[27] D. J. White, Real applications of Markov decision processes, Interfaces, 15 (1985), 73-83. doi: 10.1287/inte.15.6.73.
[28] D. J. White, Further real applications of Markov decision processes, Interfaces, 18 (1988), 55-61. doi: 10.1287/inte.18.5.55.
[29] D. J. White, A survey of applications of Markov decision processes, The Journal of the Operational Research Society, 44 (1993), 1073-1096.
[30] S. Yakowitz, Dynamic programming applications in water resources, Water Resources Research, 18 (1982), 673-696. doi: 10.1029/WR018i004p00673.
[31] W. W.-G. Yeh, Reservoir management and operation models: A state-of-the-art review, Water Resources Research, 21 (1985), 1797-1818. doi: 10.1029/WR021i012p01797.