The Actor Critic Structure in MAA2C #5

rezunli96 · 2019-03-10T05:43:32Z

I'm a little confused about your implementation of MAA2C. I don't think the input of the actor network is simply the "joint state" of the agents. According to [1], the critic's input should be the state of the environment (where the agents' joint state is not necessarily defined) plus the joint action of the agents, i.e., the critic here should be a Q-function over joint actions. The actor, in turn, should be something like a policy, so I do not quite understand why the actor network is implemented this way. I would appreciate an explanation.
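
To make the structure described above concrete, here is a minimal sketch in plain Python. It assumes discrete actions and linear function approximators. It also assumes that each actor conditions on its own local observation, which is only one possible reading of "something like a policy". All function names and dimensions are hypothetical and are not taken from this repository.

```python
import math


def one_hot(index, size):
    """Encode a discrete action index as a one-hot list of floats."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def linear(weights, bias, x):
    """Apply y = W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def actor_policy(weights, bias, local_obs):
    """Per-agent actor (assumption): pi_i(a_i | o_i) as action probabilities."""
    return softmax(linear(weights, bias, local_obs))


def critic_input(env_state, joint_action, n_actions):
    """Centralized critic input: environment state followed by the
    one-hot encoded action of every agent (the joint action)."""
    x = list(env_state)
    for action in joint_action:
        x.extend(one_hot(action, n_actions))
    return x


def critic_q(weights, bias, env_state, joint_action, n_actions):
    """Scalar Q(s, a_1, ..., a_N) from a single-output linear critic."""
    x = critic_input(env_state, joint_action, n_actions)
    return linear(weights, bias, x)[0]


if __name__ == "__main__":
    # Hypothetical sizes: 2 agents, 4 actions each, 3-dim state, 2-dim observation.
    state = [0.1, 0.2, 0.3]
    joint_action = [1, 3]
    probs = actor_policy([[0.0, 0.0]] * 4, [0.0] * 4, [1.0, 2.0])
    q = critic_q([[1.0] * 11], [0.5], state, joint_action, 4)
    print(probs, q)
```

The point of the sketch is the asymmetry between the two inputs: the critic sees the environment state together with every agent's action, while each actor only outputs a distribution over its own actions.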
