A limit theorem for Markov decision processes
1. Center for Mathematical Economics, Bielefeld University, Postfach 100131, 33501 Bielefeld, Germany
References:
[1] R. J. Aumann and M. Maschler, Repeated Games with Incomplete Information, MIT Press, Cambridge, MA, 1995.
[2] M. Bardi and I. Capuzzo-Dolcetta, Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations, Systems & Control: Foundations & Applications, Birkhäuser, 1997. doi: 10.1007/978-0-8176-4755-1.
[3] D. P. Bertsekas and S. E. Shreve, Stochastic Optimal Control: The Discrete Time Case, Academic Press, 1978.
[4] P. Billingsley, Convergence of Probability Measures, Wiley Series in Probability and Statistics, John Wiley & Sons, New York, 1999. doi: 10.1002/9780470316962.
[5] I. Capuzzo-Dolcetta and H. Ishii, Approximate solutions of the Bellman equation of deterministic control theory, Applied Mathematics & Optimization, 11 (1984), 161-181. doi: 10.1007/BF01442176.
[6] P. Cardaliaguet, C. Rainer, D. Rosenberg and N. Vieille, Markov games with frequent actions and incomplete information, 2013. arXiv:1307.3365v1 [math.OC].
[7] N. El Karoui, D. Nguyen and M. Jeanblanc-Picqué, Compactification methods in the control of degenerate diffusions: Existence of an optimal control, Stochastics, 20 (1987), 169-219. doi: 10.1080/17442508708833443.
[8] S. N. Ethier and T. G. Kurtz, Markov Processes: Characterization and Convergence, Wiley, New York, 1986. doi: 10.1002/9780470316658.
[9] M. Falcone, A numerical approach to the infinite horizon problem of deterministic control theory, Applied Mathematics & Optimization, 15 (1987), 1-13. doi: 10.1007/BF01442644.
[10] W. H. Fleming, The convergence problem for differential games, Journal of Mathematical Analysis and Applications, 3 (1961), 102-116. doi: 10.1016/0022-247X(61)90009-9.
[11] N. Gast, B. Gaujal and J.-Y. Le Boudec, Mean field for Markov decision processes: From discrete to continuous optimization, IEEE Transactions on Automatic Control, 57 (2012), 2266-2280. doi: 10.1109/TAC.2012.2186176.
[12] F. Gensbittel, Continuous-time limit of dynamic games with incomplete information and a more informed player, hal-00910970, version 1, 2013.
[13] R. Gonzalez and E. Rofman, On deterministic control problems: An approximation procedure for the optimal cost I. The stationary problem, SIAM Journal on Control and Optimization, 23 (1985), 242-266. doi: 10.1137/0323018.
[14] X. Guo and O. Hernández-Lerma, Nonzero-sum games for continuous-time Markov chains with unbounded discounted payoffs, Journal of Applied Probability, 42 (2005), 303-320. doi: 10.1239/jap/1118777172.
[15] O. Hernández-Lerma and J. B. Lasserre, Discrete-Time Markov Control Processes: Basic Optimality Criteria, Springer-Verlag, 1996. doi: 10.1007/978-1-4612-0729-0.
[16] J. Hörner, T. Sugaya, S. Takahashi and N. Vieille, Recursive methods in discounted stochastic games: An algorithm for $\delta\rightarrow 1$ and a folk theorem, Econometrica, 79 (2011), 1277-1318. doi: 10.3982/ECTA9004.
[17] O. Kallenberg, Foundations of Modern Probability, 2nd edition, Springer, New York, 2002. doi: 10.1007/978-1-4757-4015-8.
[18] I. Karatzas and S. E. Shreve, Brownian Motion and Stochastic Calculus, 2nd edition, Springer-Verlag, 2000.
[19] H. J. Kushner, Weak Convergence Methods and Singularly Perturbed Stochastic Control and Filtering Problems, Systems & Control: Foundations & Applications, Birkhäuser, Boston, MA, 1990. doi: 10.1007/978-1-4612-4482-0.
[20] H. J. Kushner and P. Dupuis, Numerical Methods for Stochastic Control Problems in Continuous Time, 2nd edition, Springer, New York, 2001. doi: 10.1007/978-1-4613-0007-6.
[21] A. Neyman, Stochastic games with short-stage duration, Dynamic Games and Applications, 3 (2013), 236-278. doi: 10.1007/s13235-013-0083-x.
[22] A. S. Nowak and T. E. S. Raghavan, Existence of stationary correlated equilibria with symmetric information for discounted stochastic games, Mathematics of Operations Research, 17 (1992), 519-526. doi: 10.1287/moor.17.3.519.
[23] W. H. Sandholm and M. Staudigl, Stochastic stability in the small noise double limit, I: Theory, unpublished manuscript, University of Wisconsin and Bielefeld University, 2014.
[24] W. H. Sandholm and M. Staudigl, Stochastic stability in the small noise double limit, II: The logit model, unpublished manuscript, University of Wisconsin and Bielefeld University, 2014.
[25] Y. Sannikov, Games with imperfectly observable actions in continuous time, Econometrica, 75 (2007), 1285-1329. doi: 10.1111/j.1468-0262.2007.00795.x.
[26] S. Soulaimani, M. Quincampoix and S. Sorin, Repeated games and qualitative differential games: Approachability and comparison of strategies, SIAM Journal on Control and Optimization, 48 (2009), 2461-2479. doi: 10.1137/090749098.
[27] M. Staudigl, Stochastic stability in asymmetric binary choice coordination games, Games and Economic Behavior, 75 (2012), 372-401. doi: 10.1016/j.geb.2011.11.003.
[28] M. Staudigl and J.-H. Steg, On repeated games with imperfect public monitoring: From discrete to continuous time, unpublished manuscript, Bielefeld University, 2014.
[29] J. Warga, Optimal Control of Differential and Functional Equations, Academic Press, New York-London, 1972.