PPO-EWMA

Status: Archive (code is provided as-is, no updates expected)

[Paper]

This repository contains code for training agents using PPO-EWMA and PPG-EWMA, introduced in the paper "Batch size-invariance for policy optimization" (see Citation). It is based on the code for Phasic Policy Gradient.

Installation

Supported platforms: macOS and Ubuntu, with Python 3.7

Installation using Miniconda:

git clone https://github.com/openai/ppo-ewma.git
conda env update --name ppo-ewma --file ppo-ewma/environment.yml
conda activate ppo-ewma
pip install -e ppo-ewma
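
After running the commands above, a quick sanity check can confirm that the environment looks as expected. The following is a minimal sketch, not part of the repository: it assumes the installed package is importable as `ppo_ewma` (inferred from the repository's `ppo_ewma` directory), and the helper names `is_importable` and `check_environment` are hypothetical.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


def check_environment(module_name: str = "ppo_ewma") -> list:
    """Collect human-readable problems; an empty list means both checks passed.

    The default module name is an assumption based on the repository layout.
    """
    problems = []
    # The README lists Python 3.7 as the supported version.
    if sys.version_info[:2] != (3, 7):
        found = "{}.{}".format(*sys.version_info[:2])
        problems.append("Python 3.7 expected, found " + found)
    if not is_importable(module_name):
        problems.append("module '" + module_name + "' is not importable")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print("warning:", issue)
    if not issues:
        print("environment looks OK")
```

Run it with the `ppo-ewma` conda environment activated; any warning suggests that the environment was not activated or that the `pip install -e` step did not complete.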

Alternatively, install the dependencies from environment.yml manually.
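
For a manual install, it can help to separate the conda packages from the pip packages listed in the environment file. The sketch below is an illustration only: it uses a small line-based reader for the common `dependencies:` layout rather than a full YAML parser, the function name `split_dependencies` is hypothetical, and the sample text is made up and is not the contents of this repository's environment.yml.

```python
def split_dependencies(env_text):
    """Split the `dependencies:` section into (conda_deps, pip_deps).

    Handles only the usual layout: a top-level `dependencies:` key holding
    `- item` entries, optionally with a nested `- pip:` list.
    """
    conda_deps, pip_deps = [], []
    in_deps = False   # currently inside the `dependencies:` section
    in_pip = False    # currently inside the nested `- pip:` list
    pip_indent = 0
    for raw in env_text.splitlines():
        # Drop trailing comments (a '#' preceded by a space) and blank lines.
        line = raw.split(" #", 1)[0].rstrip()
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        indent = len(line) - len(line.lstrip())
        if indent == 0 and not stripped.startswith("- "):
            # A new top-level key starts or ends the dependencies section.
            in_deps = stripped == "dependencies:"
            in_pip = False
            continue
        if not in_deps or not stripped.startswith("- "):
            continue
        item = stripped[2:].strip()
        if item == "pip:":
            in_pip, pip_indent = True, indent
        elif in_pip and indent > pip_indent:
            pip_deps.append(item)
        else:
            in_pip = False
            conda_deps.append(item)
    return conda_deps, pip_deps


# Hypothetical sample, for illustration only.
SAMPLE = """\
name: example-env
channels:
  - conda-forge
dependencies:
  - python=3.7
  - pip
  - pip:
    - numpy
    - example-package==1.0
"""

if __name__ == "__main__":
    conda_deps, pip_deps = split_dependencies(SAMPLE)
    print("conda:", conda_deps)
    print("pip:", pip_deps)
```

The first list can then be passed to `conda install` and the second to `pip install` inside the target environment.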

Visualize results

Results are stored in blob storage at https://openaipublic.blob.core.windows.net/rl-batch-size-invariance/, and can be visualized as in the paper using this Colab notebook.

Citation

Please cite using the following BibTeX entry:

@article{hilton2021batch,
  title={Batch size-invariance for policy optimization},
  author={Hilton, Jacob and Cobbe, Karl and Schulman, John},
  journal={arXiv preprint arXiv:2110.00641},
  year={2021}
}
