Authors:
Dennis J. N. J. Soemers, Vegard Mella, Eric Piette, Matthew Stephenson, Cameron Browne, Olivier Teytaud

Venue:
Transactions on Machine Learning Research (TMLR), 2023

Topics:
transfer learning, reinforcement learning, policy-value networks, general game playing, zero-shot learning

Links: PDF · OpenReview

Abstract

This paper investigates how to transfer policy-value networks between different games, including games with different board sizes, shapes, rules, and action spaces.

The approach leverages the Ludii general game system and its high-level game description language to identify shared semantics between games and enable transfer across domains.

The proposed method supports both zero-shot transfer and transfer with fine-tuning. Experiments across hundreds of game pairings show that effective transfer is often possible, even between substantially different games.

Context

This work addresses a key limitation of deep reinforcement learning: the lack of generalisation and transfer across tasks.

By focusing on games described in a shared domain-specific language (DSL), the paper shows how structural information about tasks can be exploited to guide transfer between them.

A central contribution is a simple yet effective method for transferring fully convolutional policy-value networks, including mappings between state and action representations across domains.
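
To make the idea of mapping state representations concrete, here is a minimal sketch of transferring first-layer weights between two games by matching channels. It is an illustration under stated assumptions, not the paper's implementation: the channel labels, the helper names `map_channels` and `transfer_input_kernels`, and the choice to zero-fill unmatched channels are all hypothetical.

```python
from typing import List, Optional

Kernel = List[List[float]]  # one k x k spatial filter for a single input channel


def map_channels(source_labels: List[str], target_labels: List[str]) -> List[Optional[int]]:
    """For each target channel, find the source channel with the same label (or None)."""
    index_of = {label: i for i, label in enumerate(source_labels)}
    return [index_of.get(label) for label in target_labels]


def transfer_input_kernels(source_kernels: List[Kernel],
                           mapping: List[Optional[int]],
                           k: int) -> List[Kernel]:
    """Copy kernels for matched channels; zero-fill channels with no source match.

    Zero-filling is an assumption made for this sketch.
    """
    transferred = []
    for src in mapping:
        if src is None:
            transferred.append([[0.0] * k for _ in range(k)])
        else:
            transferred.append([row[:] for row in source_kernels[src]])
    return transferred


# Hypothetical channel labels for a source and a target game.
source_labels = ["empty", "player1_piece", "player2_piece", "last_move"]
target_labels = ["empty", "player1_piece", "player2_piece", "swap_rule"]

# Toy 1x1 kernels so the result is easy to read.
source_kernels = [[[0.1]], [[0.2]], [[0.3]], [[0.4]]]

mapping = map_channels(source_labels, target_labels)
target_kernels = transfer_input_kernels(source_kernels, mapping, k=1)
print(mapping)
print(target_kernels)
```

In this toy case the three shared channels keep their source weights, while the target-only channel starts from zero and would have to be learned during fine-tuning. The same pattern could in principle be applied to the output channels of a policy head, which is how action spaces of different games might be aligned.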

The results highlight that training on simpler or smaller games and transferring to more complex ones can sometimes outperform direct training on the target game, especially in zero-shot settings.
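
One reason a network trained on a smaller board can be applied to a larger one without retraining is that a convolutional kernel does not depend on the size of its input. The sketch below shows this with a hand-written kernel in plain Python. The function `conv2d_same` and the neighbour-counting kernel are illustrative assumptions, not part of the paper; a real policy-value network would use many learned kernels over multiple channels.

```python
from typing import List

Grid = List[List[float]]


def conv2d_same(board: Grid, kernel: Grid) -> Grid:
    """Apply a k x k kernel with zero padding, so the output has the board's shape."""
    n_rows, n_cols = len(board), len(board[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            total = 0.0
            for i in range(k):
                for j in range(k):
                    rr, cc = r + i - pad, c + j - pad
                    # Cells outside the board contribute zero.
                    if 0 <= rr < n_rows and 0 <= cc < n_cols:
                        total += kernel[i][j] * board[rr][cc]
            out[r][c] = total
    return out


# Toy kernel: counts occupied cells among the eight neighbours.
kernel = [[1.0, 1.0, 1.0],
          [1.0, 0.0, 1.0],
          [1.0, 1.0, 1.0]]

# A 3x3 board and a 5x5 board, each with a single piece in the centre.
small = [[0.0] * 3 for _ in range(3)]
small[1][1] = 1.0
large = [[0.0] * 5 for _ in range(5)]
large[2][2] = 1.0

out_small = conv2d_same(small, kernel)
out_large = conv2d_same(large, kernel)
print(len(out_small), len(out_small[0]))
print(len(out_large), len(out_large[0]))
```

The same kernel produces a per-cell output on both boards, with the output shape following the input shape. This illustrates why weights can be reused across board sizes at all; it does not by itself show that the reused weights will play well on the larger board.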

Full reference

Soemers, D. J. N. J., Mella, V., Piette, E., Stephenson, M., Browne, C., & Teytaud, O. (2023). Towards a General Transfer Approach for Policy-Value Networks. Transactions on Machine Learning Research.

BibTeX

@article{soemers2023transfer,
  author  = {Soemers, Dennis J. N. J. and Mella, Vegard and Piette, Eric and Stephenson, Matthew and Browne, Cameron and Teytaud, Olivier},
  title   = {Towards a General Transfer Approach for Policy-Value Networks},
  journal = {Transactions on Machine Learning Research},
  year    = {2023},
  url     = {https://openreview.net/forum?id=vJcTm2v9Ku}
}