Code for the Population-Based Bandits Algorithm, presented at NeurIPS 2020.

Last update: Nov 16, 2022

Population-Based Bandits (PB2)

Code for the Population-Based Bandits (PB2) Algorithm, from the paper Provably Efficient Online Hyperparameter Optimization with Population-Based Bandits.

The code builds on Ray (using RLlib and Tune) and GPy, and is heavily inspired by the Ray Tune pbt_ppo example.

NOTE: PB2 is included in the ray.tune library, which is the officially supported implementation. The link to the code is here, and the accompanying blog post is here.

Running the Code

To run the IMPALA experiment, use the command:

python run_impala.py

To run the PPO experiment, use the command:

python run_ppo.py

Config

Each script exposes several settings that you can change:

- env_name: the environment to train on, for example BreakoutNoFrameskip-v4.
- method: either pb2 or pbt (or asha for PPO).
- freq: how often the hyperparameters are updated, in timesteps. We use 500,000 for IMPALA and 50,000 for PPO.
- seed: the random seed. We used 0 1 2 3 4 5 6... and plan to add more seeds.
- max: the maximum number of timesteps. We used 10,000,000 for IMPALA and 1,000,000 for PPO.
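As an illustration of how the settings above fit together, here is a minimal sketch of a command-line parser for them. It is not taken from run_impala.py or run_ppo.py: the `--` flag spellings, the `build_parser` helper, and the defaults (set here to the PPO values quoted above) are assumptions made for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the settings listed in this README."""
    parser = argparse.ArgumentParser(description="PB2 experiment settings (sketch)")
    # Environment id; the default is only the example given in the list above.
    parser.add_argument("--env_name", type=str, default="BreakoutNoFrameskip-v4")
    # Scheduler to use; asha is only mentioned for the PPO experiments.
    parser.add_argument("--method", choices=["pb2", "pbt", "asha"], default="pb2")
    # Timesteps between hyperparameter updates (50,000 was used for PPO).
    parser.add_argument("--freq", type=int, default=50_000)
    # Random seed for the run.
    parser.add_argument("--seed", type=int, default=0)
    # Maximum number of timesteps (1,000,000 was used for PPO).
    parser.add_argument("--max", type=int, default=1_000_000)
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--method", "pbt", "--seed", "3"])
print(args.env_name, args.method, args.freq, args.seed, args.max)
```

Under these assumptions, an IMPALA-style run would override the two numeric defaults, for example by passing `--freq 500000 --max 10000000`.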

It should also be possible to adapt this code to run other ray tune schedulers. We used it for ASHA in our PPO experiments. We are also working to include a BOHB baseline.
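One way to picture swapping schedulers is a small dispatch on the method name. The sketch below is an assumption about how such a switch could look, not the code in this repo: `scheduler_kwargs` and `build_scheduler` are hypothetical helpers, the `lr` bounds are placeholder values, and the metric and time attribute names (`episode_reward_mean`, `timesteps_total`) are assumed RLlib result keys that may differ across Ray versions. Ray is imported lazily, so the keyword-building part runs without it installed.

```python
def scheduler_kwargs(method: str, freq: int, max_timesteps: int) -> dict:
    """Build keyword arguments for the Ray Tune scheduler named by `method`."""
    # Assumed RLlib result keys; adjust to the installed Ray version.
    common = {
        "time_attr": "timesteps_total",
        "metric": "episode_reward_mean",
        "mode": "max",
    }
    if method == "pb2":
        # PB2 searches continuous ranges given as [low, high] bounds.
        return {**common, "perturbation_interval": freq,
                "hyperparam_bounds": {"lr": [1e-5, 1e-3]}}  # placeholder bounds
    if method == "pbt":
        # PBT perturbs or resamples from the listed candidate values.
        return {**common, "perturbation_interval": freq,
                "hyperparam_mutations": {"lr": [1e-5, 1e-4, 1e-3]}}  # placeholders
    if method == "asha":
        # ASHA stops weak trials early instead of perturbing hyperparameters.
        return {**common, "grace_period": freq, "max_t": max_timesteps}
    raise ValueError(f"unknown method: {method!r}")


def build_scheduler(method: str, freq: int, max_timesteps: int):
    """Instantiate the scheduler; requires ray[tune] (and GPy for PB2)."""
    kwargs = scheduler_kwargs(method, freq, max_timesteps)
    if method == "pb2":
        from ray.tune.schedulers.pb2 import PB2
        return PB2(**kwargs)
    if method == "pbt":
        from ray.tune.schedulers import PopulationBasedTraining
        return PopulationBasedTraining(**kwargs)
    from ray.tune.schedulers import ASHAScheduler
    return ASHAScheduler(**kwargs)


print(sorted(scheduler_kwargs("asha", 50_000, 1_000_000)))
```

Keeping the keyword construction separate from the import makes it easy to add another scheduler by adding one branch in each function.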

Please get in touch with any questions: jackph [at] robots [dot] ox [dot] ac [dot] uk

Citing PB2

Finally, if you found this repo useful, please consider citing us:

@inproceedings{NEURIPS2020_c7af0926,
 author = {Parker-Holder, Jack and Nguyen, Vu and Roberts, Stephen J},
 booktitle = {Advances in Neural Information Processing Systems},
 editor = {H. Larochelle and M. Ranzato and R. Hadsell and M. F. Balcan and H. Lin},
 pages = {17200--17211},
 publisher = {Curran Associates, Inc.},
 title = {Provably Efficient Online Hyperparameter Optimization with Population-Based Bandits},
 url = {https://proceedings.neurips.cc/paper/2020/file/c7af0926b294e47e52e46cfebe173f20-Paper.pdf},
 volume = {33},
 year = {2020}
}

Owner

Jack Parker-Holder