Scalable Multi-Agent Reinforcement Learning

1. Featured algorithms:

  • Value Function Factorization with Variable Agent Sub-Teams (VAST) [1]

2. Implemented domains

All available domains are listed in the table below. The labels are used for the commands below (in 5. and 6.).

Domain               | Label               | Description
---------------------|---------------------|-----------------------------------------------------------------
Warehouse[4]         | Warehouse-4         | Warehouse domain with 4 agents in a 5x3 grid.
Warehouse[8]         | Warehouse-8         | Warehouse domain with 8 agents in a 5x5 grid.
Warehouse[16]        | Warehouse-16        | Warehouse domain with 16 agents in a 9x13 grid.
Battle[20]           | Battle-20           | Battle domain with armies of 20 agents each in a 10x10 grid.
Battle[40]           | Battle-40           | Battle domain with armies of 40 agents each in a 14x14 grid.
Battle[80]           | Battle-80           | Battle domain with armies of 80 agents each in an 18x18 grid.
GaussianSqueeze[200] | GaussianSqueeze-200 | Gaussian squeeze domain with 200 agents.
GaussianSqueeze[400] | GaussianSqueeze-400 | Gaussian squeeze domain with 400 agents.
GaussianSqueeze[800] | GaussianSqueeze-800 | Gaussian squeeze domain with 800 agents.

3. Implemented MARL algorithms

The reported MARL algorithms are listed in the tables below. The labels are used for the commands below (in 5. and 6.).

Baseline | Label
---------|------
IL       | IL
QMIX     | QMIX
QTRAN    | QTRAN

VAST(VFF operator) | Label
-------------------|-----------
VAST(IL)           | VAST-IL
VAST(VDN)          | VAST-VDN
VAST(QMIX)         | VAST-QMIX
VAST(QTRAN)        | VAST-QTRAN

VAST(assignment strategy) | Label
--------------------------|-------------------
VAST(Random)              | VAST-QTRAN-RANDOM
VAST(Fixed)               | VAST-QTRAN-FIXED
VAST(Spatial)             | VAST-QTRAN-SPATIAL
VAST(MetaGrad)            | VAST-QTRAN

4. Experiment parameters

Experiment parameters such as the learning rate for training (params["learning_rate"]) and the number of episodes per epoch (params["episodes_per_epoch"]) are specified in settings.py. All other hyperparameters are set in the corresponding Python modules in the package vast/controllers, where the final values listed in the technical appendix are specified as defaults.

All hyperparameters can be adjusted by setting their values via the params dictionary in settings.py.
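As a minimal sketch of this kind of override, the snippet below uses a stand-in dictionary rather than the real settings.py. The two keys come from the text above; the numeric values and the helper name with_overrides are assumptions made purely for illustration and are not the values from the technical appendix.

```python
# Stand-in for the params dictionary in settings.py (hypothetical values).
params = {
    "learning_rate": 0.001,     # placeholder, not the value used in the paper
    "episodes_per_epoch": 10,   # placeholder, not the value used in the paper
}


def with_overrides(defaults, **overrides):
    """Return a copy of `defaults` with the given keys replaced.

    The original dictionary is left untouched, so the defaults stay
    available for comparison between runs.
    """
    merged = dict(defaults)
    merged.update(overrides)
    return merged


# Adjust a single hyperparameter for one experiment.
custom_params = with_overrides(params, learning_rate=0.0005)
print(custom_params)
```

Copying the dictionary before updating it keeps the defaults intact; editing the values directly in settings.py, as described above, works just as well for a single run.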

5. Training

To train a MARL algorithm M (see tables in 3.) in domain D (see table in 2.) with compactness factor eta, run the following command:

python train.py M D eta

This command will create a folder with the name pattern output/N-agents_domain-D_subteams-S_M_datetime which contains the trained models (depending on the MARL algorithm).

train.sh is an example script for running all settings as specified in the paper.
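For running several settings from Python instead of a shell script, the sketch below assembles the training command for each combination of algorithm, domain, and compactness factor. The labels are copied from the tables in sections 2 and 3; the helper names build_train_command and build_sweep and the example eta values are assumptions for illustration and are not taken from train.sh. The snippet only prints the commands and does not start any training.

```python
from itertools import product

# Labels copied from the tables in sections 2 and 3.
DOMAINS = {
    "Warehouse-4", "Warehouse-8", "Warehouse-16",
    "Battle-20", "Battle-40", "Battle-80",
    "GaussianSqueeze-200", "GaussianSqueeze-400", "GaussianSqueeze-800",
}
ALGORITHMS = {
    "IL", "QMIX", "QTRAN",
    "VAST-IL", "VAST-VDN", "VAST-QMIX", "VAST-QTRAN",
    "VAST-QTRAN-RANDOM", "VAST-QTRAN-FIXED", "VAST-QTRAN-SPATIAL",
}


def build_train_command(algorithm, domain, eta):
    """Build the argument list for `python train.py M D eta`."""
    if algorithm not in ALGORITHMS:
        raise ValueError(f"unknown algorithm label: {algorithm}")
    if domain not in DOMAINS:
        raise ValueError(f"unknown domain label: {domain}")
    return ["python", "train.py", algorithm, domain, str(eta)]


def build_sweep(algorithms, domains, etas):
    """Build one command per (algorithm, domain, eta) combination."""
    return [
        build_train_command(m, d, eta)
        for m, d, eta in product(algorithms, domains, etas)
    ]


# Hypothetical sweep; the eta values are placeholders.
commands = build_sweep(["QMIX", "VAST-QTRAN"], ["Warehouse-4"], [0.25, 0.5])
for command in commands:
    print(" ".join(command))
```

Each argument list could then be passed to subprocess.run(command, check=True) from the repository root to launch the runs one after another.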

6. Plotting

To generate plots for a particular domain D and evaluation mode E as presented in the paper, run the following command:

python plot.py D E

The command will load and display all the data of completed training runs that are stored in the folder which is specified in params["output_folder"] (see settings.py).

The evaluation modes E are listed in the table below:

Evaluation mode                | Label
-------------------------------|------
VFF operator comparison        | F
State-of-the-art comparison    | S
Assignment strategy comparison | A
Division diversity comparison  | D

7. Rendering

To render episodes of the Warehouse[N] or Battle[N] domain, set params["render_pygame"]=True in settings.py.

8. References

  • [1] T. Phan et al., "VAST: Value Function Factorization with Variable Agent Sub-Teams", in NeurIPS 2021