This repository contains code to explore the policy optimization landscape.
To run CartPole, simply use:
python3 run_eager_policy_optimization.py --env CartPole-v0 --policy_type discrete
To run a MuJoCo environment, you must have MuJoCo installed along with the associated license. To run Hopper-v1, use:
python3 run_eager_policy_optimization.py --env Hopper-v1 --policy_type normal --std 0.5
Parameters will be saved into ./parameters as NumPy (.npy) files.
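A minimal sketch of loading a saved parameter file back into Python; the file name below is hypothetical (check ./parameters for the names written by your run), and allow_pickle is only needed if the parameters were stored as a pickled object such as a list of per-layer arrays:

```python
import numpy as np

# Hypothetical file name; use whatever run_eager_policy_optimization.py
# actually wrote into ./parameters for your run.
params = np.load('./parameters/example_run.npy', allow_pickle=True)
print(type(params))
```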
After obtaining parameters from different runs, use the following commands to analyze the landscape:
- First, install eager_pg:

  pip install -e .

- Random Perturbations Experiment (a sketch of the underlying idea follows this list):

  cd interpolation_experiments
  python paired_random_directions_experiment.py --p1 ./path/to/parameter/1/npy \
    --save_dir ./path/to/save/in/ \
    --alpha 0.5 --std 0.5 --n_directions 500

- Linear Interpolation Experiment (see the sketch after this list):

  cd interpolation_experiments
  python simple_1d_interpolation_experiment.py --p1 ./path/to/parameter/1/npy \
    --p2 ./path/to/parameter/2/npy --save_dir ./path/to/save/in/ \
    --stds 5.0 --alpha_start -0.5 --alpha_end 1.5 --n_alphas 2
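A minimal sketch of the paired-random-directions idea, not the actual script: sample Gaussian directions around a parameter vector and evaluate the return on either side. Here `theta` is assumed to be a flat NumPy vector and `evaluate_return` is a hypothetical function that runs rollouts and returns the average return.

```python
import numpy as np

def paired_random_directions(theta, evaluate_return, alpha=0.5, std=0.5,
                             n_directions=500, seed=0):
    rng = np.random.RandomState(seed)
    results = []
    for _ in range(n_directions):
        # Draw an isotropic Gaussian direction with the requested scale.
        direction = rng.normal(scale=std, size=theta.shape)
        # Evaluate the objective on either side of the current parameters.
        r_plus = evaluate_return(theta + alpha * direction)
        r_minus = evaluate_return(theta - alpha * direction)
        results.append((r_plus, r_minus))
    return np.array(results)  # shape (n_directions, 2)
```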
Note that interpolation tools only work with continuous policies.
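A minimal sketch of the 1D linear interpolation idea, not the actual script: move along the line connecting two saved parameter vectors and evaluate the return at each point. `evaluate_return(theta, std)` and the flat parameter vectors are again assumptions for illustration.

```python
import numpy as np

def linear_interpolation(theta1, theta2, evaluate_return, std=5.0,
                         alpha_start=-0.5, alpha_end=1.5, n_alphas=2):
    returns = []
    for alpha in np.linspace(alpha_start, alpha_end, n_alphas):
        # alpha = 0 recovers theta1 and alpha = 1 recovers theta2; values
        # outside [0, 1] extrapolate past the endpoints of the segment.
        theta = (1.0 - alpha) * theta1 + alpha * theta2
        returns.append(evaluate_return(theta, std))
    return np.array(returns)
```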
- eager_pg: contains a small library to enable quick research in policy gradient reinforcement learning.
- analysis_tools: contains tooling to make figures for papers.
- interpolation_experiments: contains experiments to explore the policy optimization landscape.
If you use the proposed method or code, we'd appreciate it if you could cite this work!
@article{ahmed2018understanding,
title={Understanding the impact of entropy in policy learning},
author={Ahmed, Zafarali and Roux, Nicolas Le and Norouzi, Mohammad and Schuurmans, Dale},
journal={arXiv preprint arXiv:1811.11214},
year={2018}
}
This is not an official Google product.