Markovian State and Action Abstractions for MDPs via Hierarchical MCTS.

A Bai, S Srivastava, S Russell - IJCAI, 2016 - aijunbai.github.io
… • The performance loss of POMCP(M,ϕ) is bounded by a constant multiple of the aggregation error introduced by grouping states with different optimal actions. • With probability 1, POMCP(M,ϕ,O) converges to a recursively optimal hierarchical policy for POMDP(M,ϕ) over the hierarchy defined by the input state and action abstractions. …
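The first result bounds performance loss by the aggregation error of the state abstraction ϕ. As a rough intuition (not the paper's code; the toy MDP, Q-values, and function name below are invented for illustration), one can view that error as the spread of optimal Q-values inside each abstract state:

```python
# Hypothetical sketch: measuring the aggregation error of a state
# abstraction phi on a toy MDP whose optimal Q-values are given.
# All states, actions, and values here are made up for illustration.

Q = {  # Q[state][action]: optimal action-values of a toy 4-state MDP
    "s0": {"left": 1.0, "right": 0.2},
    "s1": {"left": 0.9, "right": 0.3},  # same optimal action as s0
    "s2": {"left": 0.1, "right": 0.8},
    "s3": {"left": 0.4, "right": 0.7},  # same optimal action as s2
}

phi = {"s0": "A", "s1": "A", "s2": "B", "s3": "B"}  # state abstraction

def aggregation_error(Q, phi):
    """Largest per-action Q-value spread within any abstract state.

    When phi groups only states that agree on their optimal action,
    this spread is small; results like the one above bound the loss
    of planning on the abstract problem by a constant multiple of it.
    """
    blocks = {}
    for s, block in phi.items():
        blocks.setdefault(block, []).append(s)
    err = 0.0
    for states in blocks.values():
        for a in Q[states[0]]:
            vals = [Q[s][a] for s in states]
            err = max(err, max(vals) - min(vals))
    return err

print(aggregation_error(Q, phi))  # ≈ 0.3 for the toy values above
```

Here blocks A and B each group states sharing an optimal action ("left" for A, "right" for B), so the spread stays small; grouping s0 with s2 instead would inflate it.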