Classical Reinforcement Learning Algorithms

Asynchronous Methods for Deep Reinforcement Learning

Proposes A3C and several other asynchronous parallel algorithms for RL.

Proximal Policy Optimization Algorithms

Proposes PPO.

Implicit Quantile Networks for Distributional Reinforcement Learning

Proposes IQN.

Evaluating Agents without Rewards

Co-authored by Hafner, who also proposed Dreamer and PlaNet. The paper looks interesting.

Discovering Reinforcement Learning Algorithms

Agent57: Outperforming the Atari Human Benchmark

Proposes Agent57, which outperforms the human baseline on all 57 Atari games.