Solving Imperfect Information Poker Games Using Monte Carlo Search and POMDP Models

2020 
Recent advances in reinforcement learning have led to AI algorithms capable of beating world champions in perfect information games such as Chess and Go. However, the AI approach to imperfect information games (such as Poker) is much more difficult, because estimating hidden information and modeling the behavior of opponents can be extremely challenging. Since the Markov Decision Process (MDP) is the mathematical model underlying reinforcement learning for perfect information games, the Partially Observable Markov Decision Process (POMDP) deserves research attention for games with imperfect information. In this paper, we study a 16-card Rhode Island Hold'em poker game and present a POMDP model to formulate this imperfect information extensive game. Based on the POMDP model, we use a Bayesian approach to estimate the opponent's hand and transform the original problem into several perfect information games. Furthermore, to handle the explosive growth of storage space and computation burden, we develop a Monte Carlo optimization algorithm to estimate the action values of the POMDP model. Finally, we conduct numerical experiments on the Rhode Island Hold'em poker game to demonstrate the effectiveness of our approach.
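The recipe in the abstract, sample the opponent's hidden hand from a Bayesian posterior, determinize the game into a perfect-information instance, and average the returns per action, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy deck, `posterior`, `payoff`, and the uniform belief are all assumptions made for the example.

```python
import random
from collections import defaultdict

# Illustrative sketch (not the paper's algorithm): Monte Carlo estimation of
# action values in an imperfect-information card game, by sampling opponent
# hands from a posterior belief and solving the resulting determinized
# (perfect-information) games.  All names below are hypothetical.

DECK = list(range(16))            # 16-card deck, as in Rhode Island Hold'em
ACTIONS = ["fold", "call", "raise"]

def posterior(my_hand, observations):
    """Toy uniform posterior over opponent hands consistent with observations."""
    return [c for c in DECK if c != my_hand and c not in observations]

def payoff(my_hand, opp_hand, action):
    """Toy payoff of a determinized game: higher card wins the stake."""
    if action == "fold":
        return -1.0
    stake = 2.0 if action == "raise" else 1.0
    return stake if my_hand > opp_hand else -stake

def mc_action_values(my_hand, observations, n_samples=10_000, rng=random):
    """Estimate Q(action) by averaging payoffs over sampled opponent hands."""
    totals = defaultdict(float)
    for _ in range(n_samples):
        opp = rng.choice(posterior(my_hand, observations))  # sample hidden hand
        for a in ACTIONS:
            totals[a] += payoff(my_hand, opp, a)            # determinized game
    return {a: totals[a] / n_samples for a in ACTIONS}

q = mc_action_values(my_hand=15, observations=[])
best = max(q, key=q.get)
```

With the highest card in hand, every sampled opponent hand loses, so the estimate converges to the exact values (raise = +2, call = +1, fold = -1); in general the Monte Carlo average trades exactness for tractable storage and computation, which is the point of the abstract's approach.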