Overview
Citing
Installation
Checking
Convenience
Troubleshooting
SimuRLacra (composed of the two modules Pyrado and RcsPySim) is a Python/C++ framework for reinforcement learning from randomized physics simulations. The focus is on robotics tasks with mostly continuous control. It features randomizable simulations written in standalone Python (no license required) as well as simulations driven by the physics engines Bullet (no license required), Vortex (license required), or MuJoCo (license required).
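To make the idea of randomized physics simulations more concrete, here is a minimal sketch in plain Python. It is not Pyrado code: the classes ToyPendulumSim and DomainRandWrapper, the parameter names, and the ranges are all made up for illustration. The sketch only shows the general pattern of re-sampling the physics parameters before every rollout.

```python
import random


class ToyPendulumSim:
    """Hypothetical stand-in for a simulation environment (not a Pyrado class)."""

    def __init__(self, mass=1.0, length=0.5):
        self.domain_param = {"mass": mass, "length": length}

    def reset(self):
        # Return a trivial initial state: angle and angular velocity
        return (0.0, 0.0)


class DomainRandWrapper:
    """Hypothetical wrapper that re-samples the physics parameters at every reset."""

    def __init__(self, env, param_ranges, seed=None):
        self.env = env
        self.param_ranges = param_ranges  # maps parameter name to (low, high)
        self.rng = random.Random(seed)

    def reset(self):
        # Draw new domain parameters uniformly from their ranges before each rollout
        for name, (low, high) in self.param_ranges.items():
            self.env.domain_param[name] = self.rng.uniform(low, high)
        return self.env.reset()


env = DomainRandWrapper(ToyPendulumSim(), {"mass": (0.8, 1.2), "length": (0.4, 0.6)}, seed=0)
env.reset()
print(env.env.domain_param)
```

Because the randomization lives in a wrapper, the same toy simulation can be used with or without it, which mirrors the wrapper-based design described in the Pros below.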
Pros
- Exceptionally modular treatment of environments via wrappers. The key idea behind this was to be able to quickly modify and randomize all available simulation environments. Moreover, SimuRLacra contains unique environments that either run completely in Python or allow you to switch between the Bullet or Vortex (requires license) physics engine.
- C++ export of policies based on PyTorch Modules. Since the Policy class is a subclass of PyTorch's nn.Module, you can port your neural-network policies, learned with Python, to your C++ applications. This also holds for stateful recurrent networks.
- CPU-based parallelization for sampling the environments. Similar to the OpenAI Gym, SimuRLacra offers parallelized environments for sampling. This is done by employing Serializable, making the simulation environments fully pickleable.
- Separation of the exploration strategies and the policy. Instead of having a GaussianFNN and a GaussianRNN etc. policy, you can wrap your policy architectures with (almost) any exploration scheme. At test time, you simply strip the exploration wrapper.
- Tested integration of real-world Quanser platforms. This feature is extremely valuable if you want to conduct sim-to-real research, since you can simply replace the simulated environment with the physical one by changing one line of code.
- Tested integration of BoTorch and Optuna.
- Detailed documentation.
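The separation of exploration strategy and policy mentioned in the list above can be sketched as follows. This is a simplified illustration in plain Python, not the Pyrado implementation: LinearPolicy and GaussianExplWrapper are hypothetical names, and the policy is a scalar gain instead of a neural network.

```python
import random


class LinearPolicy:
    """Hypothetical deterministic policy: action = gain * observation."""

    def __init__(self, gain):
        self.gain = gain

    def __call__(self, obs):
        return self.gain * obs


class GaussianExplWrapper:
    """Hypothetical exploration wrapper adding Gaussian noise to the wrapped policy's action."""

    def __init__(self, policy, std, seed=None):
        self.policy = policy
        self.std = std
        self.rng = random.Random(seed)

    def __call__(self, obs):
        return self.policy(obs) + self.rng.gauss(0.0, self.std)


policy = LinearPolicy(gain=-2.0)
behavior = GaussianExplWrapper(policy, std=0.1, seed=0)  # used while training

noisy_action = behavior(1.0)  # exploratory action
test_action = behavior.policy(1.0)  # at test time, strip the wrapper
```

The policy itself never knows about the noise, so any policy architecture can be combined with any wrapper of this kind.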
Cons
- No vision-based environments/tasks. In principle there is nothing stopping you from integrating computer vision into SimuRLacra. However, I assume there are better suited frameworks out there.
- Without bells and whistles. Most implementations (especially the algorithms) do not focus on performance. After all, this framework was created to understand and prototype things.
- Hyper-parameters are not fully tuned. Sometimes the most important part of reinforcement learning is the time-consuming search for the right hyper-parameters. I only did this for the environment-algorithm combinations reported in my papers. For all the other cases, there is Optuna and some Optuna-based example scripts that you can start from.
- Unfinished GPU-support. At the moment the porting of the policies is implemented but not fully tested. The GPU-enabled re-implementation of the simulation environments in the pysim folder (simple Python simulations) is in question. The environments based on Rcs, which require the Bullet or Vortex physics engine, will only be able to run on CPU.
SimuRLacra was tested on Ubuntu 16.04 (deprecated), 18.04 (recommended), and 20.04, with PyTorch 1.4 (deprecated) and 1.7. The part without C++ dependencies, called Pyrado, also works under Windows 10 (not supported).
Not the right framework for you?
- If you are looking for even more modular code or simply want to see how much you can do with Python decorators, check out vel. It is a beautiful framework that includes more than reinforcement learning.
- If you need code optimized for performance, check out stable baselines. I know, that was captain obvious.
- If you are missing value-based algorithms with bells and whistles, check out MushroomRL. The main contributor is good at every sport. Sorry Carlo, but the world has to know it.
If you use code or ideas from this project for your research, please cite SimuRLacra.
@misc{Muratore_SimuRLacra,
author = {Fabio Muratore},
title = {SimuRLacra - A Framework for Reinforcement Learning from Randomized Simulations},
year = {2020},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/famura/SimuRLacra}}
}
It is recommended to install SimuRLacra in a separate virtual environment such as anaconda. Follow the instructions on the anaconda homepage to download the anaconda (or miniconda) version for your machine (anaconda 3 is recommended).
Clone the repository and go to the project's directory
git clone https://github.com/famura/SimuRLacra.git
# or via ssh
# git clone [email protected]:famura/SimuRLacra.git
cd SimuRLacra
Create an anaconda environment (without PyTorch)
conda create -y -n pyrado python=3.7
conda activate pyrado
conda install -y blas cmake lapack libgcc-ng mkl patchelf pip pycairo setuptools -c conda-forge
pip install argparse box2d colorama coverage cython glfw gym joblib prettyprinter matplotlib numpy optuna pandas pytest pytest-cov pytest-xdist pyyaml scipy seaborn sphinx sphinx-math-dollar sphinx_rtd_theme tabulate tensorboard tqdm vpython git+https://github.com/Xfel/init-args-serializer.git@master
Any warnings from VPython can be safely ignored.
If you just want to have a look at SimuRLacra, or don't care about the Rcs-based robotics part, I recommend going for Red Velvet. However, if you, for example, want to export your learned controller to a C++ program running on a physical robot, I recommend Black Forest. Here is an overview of the options:
Options | PyTorch build | Policy export to C++ | CUDA support | Rcs-based simulations (RcsPySim) | Python-based simulations (Pyrado) | (subset of) mujoco-py simulations |
---|---|---|---|---|---|---|
Red Velvet | pip | ❌ | ✔️ | ❌ | ✔️ | ✔️ |
Malakoff | local | ✔️ | ❌ | ❌ | ✔️ | ✔️ |
Sacher | pip | ❌ | ✔️ | ✔️ | ✔️ | ✔️ |
Black Forest | local | ✔️ | ❌ | ✔️ | ✔️ | ✔️ |
Please note that the Vortex (optionally used in RcsPySim) as well as the MuJoCo (mandatory for mujoco-py) physics engine require a license.
Please note that building PyTorch locally from source will take about 30-60 min.
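If you prefer to pick an option programmatically, the table above can be encoded as a small lookup. The following is only a sketch: the dictionary keys and the function pick_option are made up for illustration, while the values are copied from the table rows.

```python
# Rows copied from the overview table (keys are hypothetical names)
OPTIONS = {
    "Red Velvet": {"pytorch_build": "pip", "cpp_export": False, "cuda": True, "rcspysim": False},
    "Malakoff": {"pytorch_build": "local", "cpp_export": True, "cuda": False, "rcspysim": False},
    "Sacher": {"pytorch_build": "pip", "cpp_export": False, "cuda": True, "rcspysim": True},
    "Black Forest": {"pytorch_build": "local", "cpp_export": True, "cuda": False, "rcspysim": True},
}


def pick_option(need_cpp_export, need_rcspysim):
    """Return the name of the option whose table row matches the two requirements exactly."""
    for name, features in OPTIONS.items():
        if features["cpp_export"] == need_cpp_export and features["rcspysim"] == need_rcspysim:
            return name
    raise ValueError("no matching option")


print(pick_option(need_cpp_export=True, need_rcspysim=True))
```

Note that, according to the table, policy export to C++ and CUDA support exclude each other, so asking for C++ export always implies a local PyTorch build.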
In all cases you will download Rcs, eigen3, pybind11, catch2, and mujoco-py into the thirdParty directory as git submodules. Rcs will be placed in the project's root directory.
For Red Velvet, run (the setup script calls git submodule init and git submodule update)
conda activate pyrado
pip install torch==1.7.0
# or if CUDA support not needed
# pip install torch==1.7.0+cpu torchvision==0.8.1+cpu torchaudio==0.7.0 -f https://download.pytorch.org/whl/torch_stable.html
python setup_deps.py wo_rcs_wo_pytorch -j8
In case this process crashes, please first check the Troubleshooting section below.
For Malakoff, run (the setup script calls git submodule init and git submodule update)
conda activate pyrado
python setup_deps.py wo_rcs_w_pytorch -j8
In case this process crashes, please first check the Troubleshooting section below.
Infrastructure dependent: install libraries system-wide
Parts of this framework create Python bindings of Rcs called RcsPySim. Running Rcs requires several libraries which can be installed (requires sudo rights) via
python setup_deps.py dep_libraries
This command will install g++-4.8, libqwt-qt5-dev, libbullet-dev, libfreetype6-dev, libxml2-dev, libglu1-mesa-dev, freeglut3-dev, mesa-common-dev, libopenscenegraph-dev, openscenegraph, and liblapack-dev.
In case you have no sudo rights, but want to use all the Rcs-dependent environments, you can try installing the libraries via anaconda. For reference, see the comments behind required_packages in setup_deps.py.
If you can't install the libraries, you can still use the Python part of this framework called Pyrado, but no environments in the rcspysim folder.
For Sacher, run (the setup script calls git submodule init and git submodule update)
conda activate pyrado
pip install torch==1.7.0
# or if CUDA support not needed
# pip install torch==1.7.0+cpu torchvision==0.8.1+cpu torchaudio==0.7.0 -f https://download.pytorch.org/whl/torch_stable.html
python setup_deps.py w_rcs_wo_pytorch -j8
In case this process crashes, please first check the Troubleshooting section below.
Infrastructure dependent: install libraries system-wide
Parts of this framework create Python bindings of Rcs called RcsPySim. Running Rcs requires several libraries which can be installed (requires sudo rights) via
python setup_deps.py dep_libraries
This command will install g++-4.8, libqwt-qt5-dev, libbullet-dev, libfreetype6-dev, libxml2-dev, libglu1-mesa-dev, freeglut3-dev, mesa-common-dev, libopenscenegraph-dev, openscenegraph, and liblapack-dev.
In case you have no sudo rights, but want to use all the Rcs-dependent environments, you can try installing the libraries via anaconda. For reference, see the comments behind required_packages in setup_deps.py.
If you can't install the libraries, you can still use the Python part of this framework called Pyrado, but no environments in the rcspysim folder.
For Black Forest, run (the setup script calls git submodule init and git submodule update)
conda activate pyrado
python setup_deps.py w_rcs_w_pytorch -j8
In case this process crashes, please first check the Troubleshooting section below.
In case you are at IAS and want to use SL and robcom, you can set them up (requires sudo rights) with
python setup_deps.py robcom -j8
After that you still need to install the robot-specific package in SL.
conda activate pyrado
conda env list
conda list | grep torch # check if the desired version of PyTorch is installed
python --version # should return Python 3.7.X :: Anaconda, Inc.
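The same checks can also be done from within Python. The helper below is a small sketch using only the standard library; the function name check_setup and its arguments are made up for illustration and are not part of Pyrado.

```python
import importlib.util
import sys


def check_setup(required_python=(3, 7), modules=("torch",)):
    """Report whether the interpreter matches the expected version and which modules can be found."""
    report = {"python_ok": sys.version_info[:2] == tuple(required_python)}
    for name in modules:
        # find_spec returns None if a top-level module is not installed
        report[name] = importlib.util.find_spec(name) is not None
    return report


print(check_setup())
```

Run it with the pyrado environment activated. An entry that is False for torch means that PyTorch is not visible to this interpreter.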
As an example, check one of the basic Pyrado environments (implemented in Python without dependencies on RcsPySim)
conda activate pyrado
cd PATH_TO/SimuRLacra/Pyrado/scripts
python sandbox/sb_qcp.py --env_name qcp-su --dt 0.002
Quickly check the environments interfacing Rcs via RcsPySim
python sandbox/sb_qq_rcspysim.py
If this does not work, it may be because neither Vortex nor Bullet is installed.
For deeper testing, run Pyrado's unit tests
cd PATH_TO/SimuRLacra/Pyrado/tests
pytest -v -m "not longtime"
If not already activated, execute
conda activate pyrado
Build both html documentations
cd PATH_TO/SimuRLacra
./build_docs.sh
This will fail if you did not set up RcsPySim.
RcsPySim
firefox RcsPySim/build/doc/html/index.html
Pyrado
firefox Pyrado/doc/build/index.html
You will find yourself often in the same folders, so adding the following aliases to your shell's rc-file will be worth it.
alias cds='cd PATH_TO/SimuRLacra'
alias cdps='cd PATH_TO/SimuRLacra/Pyrado/scripts'
alias cdpt='cd PATH_TO/SimuRLacra/Pyrado/data/temp'
alias cdrps='cd PATH_TO/SimuRLacra/RcsPySim/build'
alias cdrcs='cd PATH_TO/SimuRLacra/Rcs/build'
Assuming that you use an IDE (in this case CLion), it is nice to put an empty CMakeLists.txt into the Python part of your project (here Pyrado) and include this as a subdirectory from the C++ part of your project by adding
add_subdirectory(../Pyrado "${CMAKE_BINARY_DIR}/pyrado")
If you then create a project in the RcsPySim directory, your IDE will automatically add Pyrado for you. If you moreover mark Pyrado as sources root (CLion specific), it will be parsed by the IDE's git tool.
I also suggest creating a run configuration that always builds the C++ part (RcsPySim) before executing a Python script.
In CLion, for example, you go to Run->Edit Configurations ..., select CMake Application, hit the plus, select _rcsenv as target and python as executable, make your program arguments a module call like -m scripts.sandbox.sb_p3l in connection with the correct working directory PATH_TO/SimuRLacra/Pyrado, and most importantly select Build in the Before launch section.
In a similar fashion, you can directly call Rcs. This is useful when you are creating a new environment and want to iterate on the graph xml-file.
In CLion, for example, you go to Run->Edit Configurations ..., select CMake Application, hit the plus, select _rcsenv as target and Rcs as executable, pass Rcs-specific arguments as your program arguments like -m 4 -dir PATH_TO/SimuRLacra/RcsPySim/config/Planar3Link/ -f gPlanar3Link.xml in connection with the correct working directory PATH_TO/SimuRLacra/Rcs/build, and select Build in the Before launch section.
There are many more command line arguments for Rcs. Look for argP in the Rcs.cpp source file.
To look at the training report in detail from the console, I recommend putting
function pretty_csv {
column -t -s, -n "$@" | less -F -S -X -K
}
into your shell's rc-file. Executing pretty_csv progress.csv in the experiments folder will yield a nicely formatted table.
I found this neat little trick on Stefaan Lippens' blog. You might need to install column depending on your OS.
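If column is not available on your machine, a similar table can be produced with a few lines of Python. This is only a sketch using the standard library; the function name format_csv and the column names in the example are made up, and there is no paging as with less.

```python
import csv
import io


def format_csv(text):
    """Return the CSV content as a table with left-aligned, space-padded columns."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ""
    num_cols = max(len(row) for row in rows)
    # Pad short rows so that every row has the same number of cells
    rows = [row + [""] * (num_cols - len(row)) for row in rows]
    widths = [max(len(row[i]) for row in rows) for i in range(num_cols)]
    lines = ["  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip() for row in rows]
    return "\n".join(lines)


# Hypothetical excerpt of a progress file
print(format_csv("iteration,avg_return\n0,-12.5\n10,3.25\n"))
```

To use it on a real file, read the file content first, e.g. with open("progress.csv").read(), and pass it to format_csv.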
Depending on the libraries installed on your machine, you might receive the linker error undefined reference to inflateValidate@ZLIB_1.2.9 while building Rcs or RcsPySim.
In order to solve this error, link the z library to the necessary targets by editing PATH_TO/SimuRLacra/Rcs/bin/CMakeLists.txt, replacing
TARGET_LINK_LIBRARIES(Rcs RcsCore RcsGui RcsGraphics RcsPhysics)
by
TARGET_LINK_LIBRARIES(Rcs RcsCore RcsGui RcsGraphics RcsPhysics z)
and
TARGET_LINK_LIBRARIES(TestGeometry RcsCore RcsGui RcsGraphics RcsPhysics)
by
TARGET_LINK_LIBRARIES(TestGeometry RcsCore RcsGui RcsGraphics RcsPhysics z)
The same goes for PATH_TO/SimuRLacra/Rcs/examples/CMakeLists.txt, where you replace
TARGET_LINK_LIBRARIES(ExampleForwardKinematics RcsCore RcsGui RcsGraphics)
by
TARGET_LINK_LIBRARIES(ExampleForwardKinematics RcsCore RcsGui RcsGraphics z)
and
TARGET_LINK_LIBRARIES(ExampleKinetics RcsCore RcsGui RcsGraphics RcsPhysics)
by
TARGET_LINK_LIBRARIES(ExampleKinetics RcsCore RcsGui RcsGraphics RcsPhysics z)
By default, the sampling (on CPU) in Pyrado is parallelized using PyTorch's multiprocessing module. Thus, your debugger will not be connected to the right process. Rerun your script with num_sampler_envs=1 passed as a parameter to the algorithm, which will then construct a sampler that only uses one process.
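The reason why a single sampler process helps with debugging can be illustrated with a toy sampler. This is not Pyrado's sampler: the functions rollout and sample are hypothetical, and the standard library's concurrent.futures is used here as a stand-in for PyTorch's multiprocessing module.

```python
from concurrent.futures import ProcessPoolExecutor


def rollout(seed):
    """Hypothetical rollout function; returns a fake return computed from the seed."""
    return float(seed) * 0.5


def sample(seeds, num_workers=4):
    """Collect one rollout per seed, in-process if num_workers == 1."""
    if num_workers == 1:
        # Everything runs in the calling process, so breakpoints inside rollout() are hit
        return [rollout(seed) for seed in seeds]
    # Otherwise the rollouts run in worker processes that the debugger is not attached to
    with ProcessPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(rollout, seeds))


if __name__ == "__main__":
    print(sample(range(4), num_workers=1))
```

With num_workers=1 the sketch never leaves the main process, which is the behavior you want while stepping through the code.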
If you are using Vortex, which itself has a Qt5-based GUI, RcsPySim may look for the wrong libpng version. Make sure that it finds the same one as Rcs (libpng16.so) and not the one from Vortex (libpng15.so). You can investigate this using the ldd (or lddtree if installed) command on the generated RcsPySim executables.
An easy fix is to go to your Vortex library directory and move all Qt5-related libs to a newly created folder, such that they can't be found. This solution is perfectly fine since we are not using the Vortex GUI anyway. Next, clear the RcsPySim/build folder and build it again.
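Scanning the ldd output by eye gets tedious for several binaries, so here is a small sketch that extracts the libpng versions for you. The function names and the sample output are made up for illustration; only the ldd command itself is assumed to exist on your system.

```python
import re
import subprocess


def libpng_versions(ldd_output):
    """Return the sorted set of libpng library names mentioned in the output of ldd."""
    return sorted(set(re.findall(r"libpng\d+\.so(?:\.\d+)*", ldd_output)))


def check_binary(path):
    """Run ldd on the given binary and list the libpng versions it resolves to."""
    result = subprocess.run(["ldd", path], capture_output=True, text=True, check=True)
    return libpng_versions(result.stdout)


# Hypothetical ldd output in which both versions show up
sample = """
    libpng16.so.16 => /usr/lib/x86_64-linux-gnu/libpng16.so.16 (0x00007f0000000000)
    libpng15.so.15 => /opt/vortex/lib/libpng15.so.15 (0x00007f0000100000)
"""
print(libpng_versions(sample))
```

If check_binary reports a libpng15 entry for one of the RcsPySim binaries, the Vortex version is being picked up.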
Check with which precision Bullet was built in Rcs
cd PATH_TO/SimuRLacra/Rcs/build
ccmake .
Use the same in RcsPySim
cd PATH_TO/SimuRLacra/RcsPySim/build
ccmake .
Rebuild RcsPySim (with activated anaconda env)
cd PATH_TO/SimuRLacra/RcsPySim/build
make -j12
ModuleNotFoundError: No module named 'init_args_serializer'
Install it from
git+https://github.com/Xfel/init-args-serializer.git@master
When you export the anaconda environment, the yml-file will contain the line init-args-serializer==1.0. This will cause an error when creating a new anaconda environment from this yml-file. To fix this, replace the line with git+https://github.com/Xfel/init-args-serializer.git@master.
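The replacement in the exported yml-file can be automated. The following sketch works on the file content as a string; the function name fix_exported_env and the example excerpt are made up, and the layout of the pip section is assumed to be the usual one produced by conda env export.

```python
GIT_SOURCE = "git+https://github.com/Xfel/init-args-serializer.git@master"


def fix_exported_env(yml_text):
    """Replace the pinned init-args-serializer entry by its git source, keeping the indentation."""
    fixed_lines = []
    for line in yml_text.splitlines():
        if line.strip() == "- init-args-serializer==1.0":
            indent = line[: len(line) - len(line.lstrip())]
            line = f"{indent}- {GIT_SOURCE}"
        fixed_lines.append(line)
    return "\n".join(fixed_lines) + "\n"


# Hypothetical excerpt of an exported environment file
exported = "name: pyrado\ndependencies:\n  - pip:\n    - init-args-serializer==1.0\n    - tqdm==4.50.0\n"
print(fix_exported_env(exported))
```

All other lines are passed through unchanged, so the function can be applied to the whole file.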
You run a script and get ImportError: cannot import name 'export'? Check if your PyTorch version is >= 1.2. If not, update via
cd PATH_TO/SimuRLacra
python setup_deps.py pytorch -j12
or install the pre-compiled version from anaconda using
conda install pytorch torchvision cpuonly -c pytorch
Note: if you choose the latter, the C++ export of policies will not work.
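The version requirement can be checked with a few lines of Python. The helper below is a sketch with a made-up name (version_at_least) that only compares the leading major and minor numbers of a version string.

```python
import re


def version_at_least(version, minimum=(1, 2)):
    """Compare the leading major.minor numbers of a version string against a minimum."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"cannot parse version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= tuple(minimum)


print(version_at_least("1.7.0+cpu"))  # local build suffixes are ignored
```

With PyTorch installed, you would pass torch.__version__ as the first argument.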
If you receive PATH_TO/anaconda3/envs/pyrado/bin/python: can't open file 'setup.py': [Errno 2] No such file or directory while executing python setup_deps.py pytorch, delete the thirdParty/pytorch folder and run
cd PATH_TO/SimuRLacra
python setup_deps.py pytorch -j12
Option 1: if you have sudo rights, run
sudo apt-get install libopenblas-dev
and then rebuild PyTorch from scratch. Option 2: if you don't have sudo rights, run
conda install -c conda-forge lapack
and then rebuild PyTorch from scratch.
Run the setup_deps.py script again with --local_torch, or explicitly set USE_LIBTORCH = ON in the cmake arguments of RcsPySim
cd PATH_TO/SimuRLacra/Rcs/build
ccmake . # set the option, configure (2x), and generate
The PyTorch setup script (thirdParty/pytorch/setup.py) automatically determines the number of CPUs used for compiling. This can be overridden by setting the environment variable MAX_JOBS:
export MAX_JOBS=1
Please use your shell syntax accordingly (the above example is for bash).
Download mujoco200 linux from the official page and extract it to ~/.mujoco such that you have ~/.mujoco/mujoco200. Put your MuJoCo license file in ~/.mujoco.
While executing setup_deps.py, mujoco-py is set up as a git submodule and installed via the downloaded setup.py.
If this fails, have a look at mujoco-py's canonical dependencies and try again. If you get an error mentioning patchelf, run conda install -c anaconda patchelf
In case you get visualization errors related to GLEW (render causes a frozen window and crashes, or simply a completely black screen), add export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libGLEW.so to your shell's rc-file (like ~/.bashrc).
If you now create a new terminal, it should work. If not, try sudo apt-get install libglew-dev.
If you don't have a MuJoCo license, or MuJoCo is not installed on your machine, mujoco-py will print an error message. One way to avoid this would be to not install mujoco-py by default. However, this would create even more options above. Thus, we will just fool mujoco-py's checker by creating a fake directory and an empty license file.
mkdir -p "$HOME/.mujoco/mujoco200" && touch "$HOME/.mujoco/mjkey.txt"
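The same workaround can be written in Python, for example if you want to run it as part of a setup script. This is a sketch with a made-up function name (create_fake_mujoco); the optional home argument only exists so that the function can be tried out on a temporary directory first.

```python
from pathlib import Path


def create_fake_mujoco(home=None):
    """Create an empty mujoco200 directory and an empty license file below the given home directory."""
    base = Path(home) if home is not None else Path.home()
    mujoco_dir = base / ".mujoco" / "mujoco200"
    mujoco_dir.mkdir(parents=True, exist_ok=True)  # same effect as mkdir -p
    key_file = base / ".mujoco" / "mjkey.txt"
    key_file.touch(exist_ok=True)  # same effect as touch
    return mujoco_dir, key_file
```

Calling create_fake_mujoco() without arguments operates on your actual home directory. An existing license file is left untouched.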
This error might come from the scipy.signal.lfilter command (possibly involving scipy's fft function). For scipy versions > 1.5.2, this requires GLIBCXX_3.4.22. If your computer is out-of-date and you have no sudo rights, your best option is to pin the scipy package to version 1.5.2.
conda activate pyrado
conda remove scipy --force
pip install scipy==1.5.2