A perturbation approach to a class of discounted approximate value iteration algorithms with Borel spaces
Departamento de Matemáticas, Universidad de Sonora, Rosales s/n, Col. Centro, 83000, Hermosillo, Sonora, Mexico
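As a rough orientation on the paper's topic before the references: the sketch below is a minimal finite-state caricature of discounted approximate value iteration, composing one exact Bellman update with a nonexpansive state-aggregation step in the spirit of Gordon [10] and Stachurski [21]. It is an illustrative sketch under assumed names and a finite model, not the paper's Borel-space algorithm; `P`, `r`, `groups`, and the averaging aggregator are all assumptions for the example.

```python
# Minimal sketch: discounted approximate value iteration on a finite MDP,
# alternating the Bellman optimality operator with a nonexpansive
# aggregation step. Illustrative only; not the paper's algorithm.
import numpy as np

def bellman_operator(V, P, r, gamma):
    """One application of the Bellman optimality operator T.

    P has shape (A, S, S): P[a, s, s'] = transition probability.
    r has shape (A, S):    r[a, s]     = one-step reward.
    """
    # Q[a, s] = r(a, s) + gamma * E[V(s') | s, a]
    Q = r + gamma * (P @ V)
    return Q.max(axis=0)

def aggregate(V, groups):
    """Nonexpansive aggregation: replace V by groupwise averages.

    Averaging over fixed groups is a sup-norm nonexpansion, so the
    composed operator stays a gamma-contraction (cf. Gordon [10]).
    """
    groups = np.asarray(groups)
    W = np.empty_like(V)
    for g in np.unique(groups):
        mask = groups == g
        W[mask] = V[mask].mean()
    return W

def approx_value_iteration(P, r, gamma, groups, tol=1e-8, max_iter=10_000):
    """Iterate V <- aggregate(T V) to a sup-norm fixed point."""
    V = np.zeros(P.shape[-1])
    for _ in range(max_iter):
        V_next = aggregate(bellman_operator(V, P, r, gamma), groups)
        if np.max(np.abs(V_next - V)) < tol:
            return V_next
        V = V_next
    return V

if __name__ == "__main__":
    # Toy 2-action, 3-state chain; states 0 and 1 share an aggregation group.
    P = np.array([[[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.8]],
                  [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.0, 0.0, 1.0]]])
    r = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 2.0]])
    print(approx_value_iteration(P, r, gamma=0.95, groups=[0, 0, 1]))
```

Because the aggregator is nonexpansive, the composed map inherits the discount factor as its contraction modulus, which is why the plain fixed-point loop above converges; this is the structural fact the fixed-point references [1], [9], [21] revolve around.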
References:
[1] A. Almudevar, Approximate fixed point iteration with an application to infinite horizon Markov decision processes, SIAM Journal on Control and Optimization, 47 (2008), 2303. doi: 10.1137/S0363012904441520.
[2] E. F. Arruda, M. D. Fragoso and J. B. R. do Val, Approximate dynamic programming via direct search in the space of value function approximations, European Journal of Operational Research, 211 (2011), 343. doi: 10.1016/j.ejor.2010.11.019.
[3] D. P. Bertsekas, Dynamic Programming: Deterministic and Stochastic Models, Prentice-Hall, (1987).
[4] D. P. Bertsekas, Approximate policy iteration: A survey and some new methods, Journal of Control Theory and Applications, 9 (2011), 310. doi: 10.1007/s11768-011-1005-3.
[5] D. P. Bertsekas and J. N. Tsitsiklis, Neuro-Dynamic Programming, Athena Scientific, (1996).
[6] L. Beutel, H. Gonska and D. Kacsó, On variation-diminishing Schoenberg operators: New quantitative statements, in Multivariate Approximation and Interpolation with Applications (ed. M. Gasca), 20 (2002), 9.
[7] R. A. DeVore, The Approximation of Continuous Functions by Positive Linear Operators, Lecture Notes in Mathematics 293, Springer-Verlag, (1972).
[8] F. Dufour and T. Prieto-Rumeau, Approximation of infinite horizon discounted cost Markov decision processes, in Optimization, Control, and Applications of Stochastic Systems, (2012), 59. doi: 10.1007/978-0-8176-8337-5_4.
[9] D. P. de Farias and B. Van Roy, On the existence of fixed points for approximate value iteration and temporal difference learning, Journal of Optimization Theory and Applications, 105 (2000), 589. doi: 10.1023/A:1004641123405.
[10] G. J. Gordon, Stable function approximation in dynamic programming, in Proceedings of the Twelfth International Conference on Machine Learning, (1995), 261. doi: 10.1016/B978-1-55860-377-6.50040-2.
[11] O. Hernández-Lerma, Adaptive Markov Control Processes, Springer-Verlag, (1989). doi: 10.1007/978-1-4419-8714-3.
[12] O. Hernández-Lerma and J. B. Lasserre, Discrete-Time Markov Control Processes: Basic Optimality Criteria, Springer-Verlag, (1996). doi: 10.1007/978-1-4612-0729-0.
[13] O. Hernández-Lerma and J. B. Lasserre, Further Topics on Discrete-Time Markov Control Processes, Springer-Verlag, (1999). doi: 10.1007/978-1-4612-0561-6.
[14] D. R. Jiang and W. B. Powell, An approximate dynamic programming algorithm for monotone value functions, Operations Research, 63 (2015), 1489. doi: 10.1287/opre.2015.1425.
[15] R. Munos, Performance bounds in $L_p$-norm for approximate value iteration, SIAM Journal on Control and Optimization, 46 (2007), 541. doi: 10.1137/040614384.
[16] W. B. Powell, Approximate Dynamic Programming: Solving the Curses of Dimensionality, John Wiley & Sons, (2007). doi: 10.1002/9780470182963.
[17] W. B. Powell and J. Ma, A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications, Journal of Control Theory and Applications, 9 (2011), 336. doi: 10.1007/s11768-011-0313-y.
[18] W. B. Powell, Perspectives of approximate dynamic programming, Annals of Operations Research, 241 (2016), 319. doi: 10.1007/s10479-012-1077-6.
[19] M. L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, Wiley, (1994).
[20] J. Rust, Numerical dynamic programming in economics, in Handbook of Computational Economics, 1 (1996), 619.
[21] J. Stachurski, Continuous state dynamic programming via nonexpansive approximation, Computational Economics, 31 (2008), 141. doi: 10.1007/s10614-007-9111-5.
[22] J. Stachurski, Economic Dynamics: Theory and Computation, MIT Press, (2009).
[23] S. Stidham Jr. and R. Weber, A survey of Markov decision models for control of networks of queues, Queueing Systems, 13 (1993), 291. doi: 10.1007/BF01158935.
[24] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, (1998).
[25] B. Van Roy, Performance loss bounds for approximate value iteration with state space aggregation, Mathematics of Operations Research, 31 (2006), 234. doi: 10.1287/moor.1060.0188.
[26] O. Vega-Amaya and R. Montes-de-Oca, Application of average dynamic programming to inventory systems, Mathematical Methods of Operations Research, 47 (1998), 451. doi: 10.1007/BF01198405.
[27] D. J. White, Real applications of Markov decision processes, Interfaces, 15 (1985), 73. doi: 10.1287/inte.15.6.73.
[28] D. J. White, Further real applications of Markov decision processes, Interfaces, 18 (1988), 55. doi: 10.1287/inte.18.5.55.
[29] D. J. White, A survey of applications of Markov decision processes, Journal of the Operational Research Society, 44 (1993), 1073.
[30] S. Yakowitz, Dynamic programming applications in water resources, Water Resources Research, 18 (1982), 673. doi: 10.1029/WR018i004p00673.
[31] W. W.-G. Yeh, Reservoir management and operation models: A state-of-the-art review, Water Resources Research, 21 (1985), 1797. doi: 10.1029/WR021i012p01797.