Structure and Randomness in Planning and Reinforcement Learning

Konrad Czechowski; Piotr Januszewski; Piotr Kozakowski; Łukasz Kuciński; Piotr Miłoś

doi:10.1109/ijcnn52387.2021.9533317

Abstract

Planning in large state spaces inevitably needs to balance the depth and breadth of the search. This balance has a crucial impact on a planner's performance, yet most planners manage it only implicitly. We present a novel method, Shoot Tree Search (STS), which makes it possible to control this trade-off more explicitly. Our algorithm can be understood as an interpolation between two celebrated search mechanisms: MCTS and random shooting. It also lets the user control the bias-variance trade-off, akin to TD(n), but in the tree search context. In experiments on challenging domains, we show that STS can get the best of both worlds, consistently achieving higher scores.
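The TD(n) analogy in the abstract can be illustrated with a generic n-step return: sum the discounted rewards collected over a rollout, then bootstrap from a value estimate at the state where the rollout stops. The sketch below is not the STS algorithm from the paper; the function name `n_step_return` and its parameters are hypothetical, chosen only to show how rollout length shifts weight between observed rewards (more variance) and the value estimate (more bias).

```python
from typing import Sequence


def n_step_return(
    rewards: Sequence[float],
    bootstrap_value: float,
    gamma: float = 0.99,
) -> float:
    """Generic n-step return (illustrative, not the paper's implementation).

    rewards: rewards observed along a rollout of length n = len(rewards).
    bootstrap_value: value estimate for the state reached after the rollout
        (assumed to come from some learned value function).
    gamma: discount factor.
    """
    total = 0.0
    discount = 1.0
    for reward in rewards:
        total += discount * reward
        discount *= gamma
    # After n steps, discount == gamma ** n, so the value estimate is
    # weighted less as the rollout gets longer.
    return total + discount * bootstrap_value


# Short rollout: the result leans mostly on the value estimate.
short = n_step_return([1.0], bootstrap_value=10.0, gamma=0.5)
# Longer rollout: the result leans more on observed rewards.
longer = n_step_return([1.0, 1.0, 1.0], bootstrap_value=10.0, gamma=0.5)
```

With an empty reward list the function returns the value estimate unchanged, which corresponds to evaluating a state with no rollout at all; longer rollouts move toward a pure shooting-style estimate.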

Citations

CrossRef: 0
Web of Science: 0
Scopus: 1

Authors (5)

Konrad Czechowski, MSc
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw
Piotr Januszewski
Piotr Kozakowski, MSc
- Faculty of Mathematics, Informatics and Mechanics, University of Warsaw
Łukasz Kuciński, PhD
- Institute of Mathematics, Polish Academy of Sciences
Piotr Miłoś, habilitated doctor
- Institute of Mathematics, Polish Academy of Sciences