MAGIC: Learning Macro-Actions for Online POMDP Planning

Yiyuan Lee; Panpan Cai; David Hsu

Robotics: Science and Systems XVII

Abstract:

The partially observable Markov decision process (POMDP) is a principled general framework for robot decision making under uncertainty, but POMDP planning suffers from high computational complexity when long-term planning is required. While temporally-extended macro-actions help to cut down the effective planning horizon and significantly improve computational efficiency, how do we acquire good macro-actions? This paper proposes Macro-Action Generator-Critic (MAGIC), which performs offline learning of macro-actions optimized for online POMDP planning. Specifically, MAGIC learns a macro-action generator end-to-end, using an online planner's performance as the feedback. During online planning, the generator generates situation-aware macro-actions on the fly, conditioned on the robot's belief and the environment context. We evaluated MAGIC on several long-horizon planning tasks both in simulation and on a real robot. The experimental results show that the learned macro-actions offer significant benefits in online planning performance compared with primitive actions and handcrafted macro-actions.
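
As a rough illustration of why temporally-extended macro-actions cut down the effective planning horizon, the sketch below counts the open-loop action sequences a planner would face when looking ahead a fixed number of primitive steps. The function name `count_action_sequences`, the choice of 8 options per decision, the 24-step lookahead, and the macro-action length of 8 are all illustrative assumptions, not values from the paper. The count also ignores observation branching, which a real POMDP planner must handle as well, so it shows only the general effect and not how MAGIC itself works.

```python
def count_action_sequences(num_choices: int, horizon: int, steps_per_choice: int = 1) -> int:
    """Count open-loop action sequences that reach `horizon` primitive steps.

    Each decision picks one of `num_choices` options and lasts
    `steps_per_choice` primitive steps (1 for primitive actions).
    Observation branching is deliberately ignored in this toy count.
    """
    if num_choices < 1 or horizon < 1 or steps_per_choice < 1:
        raise ValueError("all arguments must be positive")
    if horizon % steps_per_choice != 0:
        raise ValueError("horizon must be a multiple of steps_per_choice")
    decisions = horizon // steps_per_choice  # effective planning depth
    return num_choices ** decisions


# Hypothetical numbers: 8 options per decision, 24 primitive steps of lookahead.
primitive = count_action_sequences(num_choices=8, horizon=24)
macro = count_action_sequences(num_choices=8, horizon=24, steps_per_choice=8)
print(f"primitive actions: depth 24, {primitive:.3e} sequences")
print(f"macro-actions:     depth 3, {macro} sequences")
```

With these assumed numbers the planning depth drops from 24 decisions to 3. The trade-off is that a fixed set of macro-actions covers far fewer behaviours, which is why the choice of macro-actions matters.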

Bibtex:

  
@INPROCEEDINGS{Lee-RSS-21, 
    AUTHOR    = {Yiyuan Lee AND Panpan Cai AND David Hsu}, 
    TITLE     = {{MAGIC: Learning Macro-Actions for Online POMDP Planning}},
    BOOKTITLE = {Proceedings of Robotics: Science and Systems}, 
    YEAR      = {2021}, 
    ADDRESS   = {Virtual}, 
    MONTH     = {July}, 
    DOI       = {10.15607/RSS.2021.XVII.041} 
}