Citation:
[1] H. S. Chang, Perfect information two-person zero-sum Markov games with imprecise transition probabilities, Math. Meth. Oper. Res., 64 (2006), 335-351. doi: 10.1007/s00186-006-0081-5.
[2] J. I. González-Trejo, O. Hernández-Lerma and L. F. Hoyos-Reyes, Minimax control of discrete-time stochastic systems, SIAM J. Control Optim., 41 (2003), 1626-1659. doi: 10.1137/S0363012901383837.
[3] E. I. Gordienko and J. A. Minjárez-Sosa, Adaptive control for discrete-time Markov processes with unbounded costs: Discounted criterion, Kybernetika (Prague), 34 (1998), 217-234.
[4] M. K. Ghosh, D. McDonald and S. Sinha, Zero-sum stochastic games with partial information, J. Optimiz. Theory Appl., 121 (2004), 99-118. doi: 10.1023/B:JOTA.0000026133.56615.cf.
[5] O. Hernández-Lerma and J. B. Lasserre, "Discrete-Time Markov Control Processes. Basic Optimality Criteria," Applications of Mathematics (New York), 30, Springer-Verlag, New York, 1996.
[6] O. Hernández-Lerma and J. B. Lasserre, "Further Topics on Discrete-Time Markov Control Processes," Applications of Mathematics (New York), 42, Springer-Verlag, New York, 1999.
[7] O. Hernández-Lerma and J. B. Lasserre, Zero-sum stochastic games in Borel spaces: Average payoff criteria, SIAM J. Control Optim., 39 (2001), 1520-1539. doi: 10.1137/S0363012999361962.
[8] A. Jaśkiewicz and A. Nowak, Zero-sum ergodic stochastic games with Feller transition probabilities, SIAM J. Control Optim., 45 (2006), 773-789. doi: 10.1137/S0363012904443257.
[9] A. Krausz and U. Rieder, Markov games with incomplete information, Math. Meth. Oper. Res., 46 (1997), 263-279. doi: 10.1007/BF01217695.
[10] H.-U. Küenle, On Markov games with average reward criterion and weakly continuous transition probabilities, SIAM J. Control Optim., 45 (2007), 2156-2168. doi: 10.1137/040617303.
[11] E. L. Lehmann and G. Casella, "Theory of Point Estimation," Second edition, Springer-Verlag, New York, 1998.
[12] F. Luque-Vásquez, Zero-sum semi-Markov games in Borel spaces: Discounted and average payoff, Bol. Soc. Mat. Mexicana (3), 8 (2002), 227-241.
[13] J. A. Minjárez-Sosa and F. Luque-Vásquez, Two person zero-sum semi-Markov games with unknown holding times distribution in one side: A discounted payoff criterion, Appl. Math. Optim., 57 (2008), 289-305. doi: 10.1007/s00245-007-9016-7.
[14] J. A. Minjárez-Sosa and O. Vega-Amaya, Asymptotically optimal strategies for adaptive zero-sum discounted Markov games, SIAM J. Control Optim., 48 (2009), 1405-1421. doi: 10.1137/060651458.
[15] K. Najim, A. S. Poznyak and E. Gómez, Adaptive policy for two finite Markov chains zero-sum stochastic game with unknown transition matrices and average payoffs, Automatica J. IFAC, 37 (2001), 1007-1018. doi: 10.1016/S0005-1098(01)00050-4.
[16] N. Shimkin and A. Shwartz, Asymptotically efficient adaptive strategies in repeated games. I. Certainty equivalence strategies, Math. Oper. Res., 20 (1995), 743-767. doi: 10.1287/moor.20.3.743.
[17] N. Shimkin and A. Shwartz, Asymptotically efficient adaptive strategies in repeated games. II. Asymptotic optimality, Math. Oper. Res., 21 (1996), 487-512. doi: 10.1287/moor.21.2.487.
[18] J. A. E. E. Van Nunen and J. Wessels, A note on dynamic programming with unbounded rewards, Manag. Sci., 24 (1978), 576-580.