How does Selfplay work with torchrl? #2201
TheRisenPhoenix asked this question in Q&A (unanswered)
In my understanding, classical self-play describes the process of training an agent against an older version of itself. In a competitive game with two agents, this would mean that one agent is trained as usual, while the other is fixed to an older version of the policy. Every now and then, the opponent's policy is updated to a more recent snapshot. By doing so, the difficulty of the environment increases over time, and the agent always plays against an opponent suited to its current skill level.
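The loop described above can be sketched in plain PyTorch. This is not a torchrl API, just an illustration of the snapshot pattern: the environment, reward, and update rule below are toy stand-ins, and only the frozen-opponent bookkeeping is the point.

```python
# Minimal self-play snapshot sketch (assumptions: toy 4-dim observations,
# a linear "policy", and a made-up objective; none of this is torchrl API).
import copy

import torch
import torch.nn as nn

learner = nn.Linear(4, 2)            # policy being trained
opponent = copy.deepcopy(learner)    # frozen older copy of the same policy
opponent.requires_grad_(False)

optimizer = torch.optim.SGD(learner.parameters(), lr=0.01)
SNAPSHOT_EVERY = 50                  # how often to refresh the opponent

for step in range(200):
    obs = torch.randn(1, 4)          # stand-in for an environment observation
    learner_action = learner(obs)
    with torch.no_grad():
        opponent_action = opponent(obs)   # opponent acts with the old weights
    # toy objective: push the learner's output away from the opponent's
    loss = -torch.dist(learner_action, opponent_action)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # every SNAPSHOT_EVERY steps, sync the opponent to the current learner
    if (step + 1) % SNAPSHOT_EVERY == 0:
        opponent.load_state_dict(learner.state_dict())
```

In a real setup the objective would come from environment returns, and one might keep a pool of past snapshots rather than a single frozen copy.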
If I understood it correctly, torchrl currently doesn't offer such functionality. It is possible to use multi-agent settings, but then the agents either always share exactly the same policy, or learn independently of each other (with the same architecture).
Is this correct, or did I overlook something?
Is there an intended way of implementing this, besides doing single-agent learning and manually managing and updating the opponent?