Scalable Multi-Agent Reinforcement Learning

Overview

Scalable Multi-Agent Reinforcement Learning

1. Featured algorithms:

  • Value Function Factorization with Variable Agent Sub-Teams (VAST) [1]

2. Implemented domains

All available domains are listed in the table below. The labels are used for the commands below (in 5. and 6.).

Domain Label Description
Warehouse[4] Warehouse-4 Warehouse domain with 4 agents in a 5x3 grid.
Warehouse[8] Warehouse-8 Warehouse domain with 8 agents in a 5x5 grid.
Warehouse[16] Warehouse-16 Warehouse domain with 16 agents in a 9x13 grid.
Battle[20] Battle-20 Battle domain with armies of 20 agents each in a 10x10 grid.
Battle[40] Battle-40 Battle domain with armies of 40 agents each in a 14x14 grid.
Battle[80] Battle-80 Battle domain with armies of 80 agents each in a 18x18 grid.
GaussianSqueeze[200] GaussianSqueeze-200 Gaussian squeeze domain 200 agents.
GaussianSqueeze[400] GaussianSqueeze-400 Gaussian squeeze domain 400 agents.
GaussianSqueeze[800] GaussianSqueeze-800 Gaussian squeeze domain 800 agents.

3. Implemented MARL algorithms

The reported MARL algorithms are listed in the tables below. The labels are used for the commands below (in 5. and 6.).

Baseline Label
IL IL
QMIX QMIX
QTRAN QTRAN
VAST(VFF operator) Label
VAST(IL) VAST-IL
VAST(VDN) VAST-VDN
VAST(QMIX) VAST-QMIX
VAST(QTRAN) VAST-QTRAN
VAST(assignment strategy) Label
VAST(Random) VAST-QTRAN-RANDOM
VAST(Fixed) VAST-QTRAN-FIXED
VAST(Spatial) VAST-QTRAN-SPATIAL
VAST(MetaGrad) VAST-QTRAN

4. Experiment parameters

The experiment parameters like the learning rate for training (params["learning_rate"]) or the number of episodes per epoch (params["episodes_per_epoch"]) are specified in settings.py. All other hyperparameters are set in the corresponding python modules in the package vast/controllers, where all final values as listed in the technical appendix are specified as default value.

All hyperparameters can be adjusted by setting their values via the params dictionary in settings.py.

5. Training

To train a MARL algorithm M (see tables in 3.) in domain D (see table in 2.) with compactness factor eta, run the following command:

python train.py M D eta

This command will create a folder with the name pattern output/N-agents_domain-D_subteams-S_M_datetime which contains the trained models (depending on the MARL algorithm).

train.sh is an example script for running all settings as specified in the paper.

6. Plotting

To generate plots for a particular domain D and evaluation mode E as presented in the paper, run the following command:

python plot.py M E

The command will load and display all the data of completed training runs that are stored in the folder which is specified in params["output_folder"] (see settings.py).

The evaluation mode E are specified in the table below:

Evaluation mode Label
VFF operator comparison F
State-of-the-art comparison S
Assignment strategy comparison A
Division diversity comparison D

7. Rendering

To render episodes of the Warehouse[N] or Battle[N] domain, set params["render_pygame"]=True in settings.py.

8. References

  • [1] T. Phan et al., "VAST: Value Function Factorization with Variable Agent Sub-Teams", in NeurIPS 2021
FEMDA: Robust classification with Flexible Discriminant Analysis in heterogeneous data

FEMDA: Robust classification with Flexible Discriminant Analysis in heterogeneous data. Flexible EM-Inspired Discriminant Analysis is a robust supervised classification algorithm that performs well i

0 Sep 06, 2022
HistoKT: Cross Knowledge Transfer in Computational Pathology

HistoKT: Cross Knowledge Transfer in Computational Pathology Exciting News! HistoKT has been accepted to ICASSP 2022. HistoKT: Cross Knowledge Transfe

Mahdi S. Hosseini 5 Jan 05, 2023
Dense Passage Retriever - is a set of tools and models for open domain Q&A task.

Dense Passage Retrieval Dense Passage Retrieval (DPR) - is a set of tools and models for state-of-the-art open-domain Q&A research. It is based on the

Meta Research 1.1k Jan 03, 2023
3D-Reconstruction 基于深度学习方法的单目多视图三维重建

基于深度学习方法的单目多视图三维重建 Part I 三维重建 代码:Part1 技术文档:[Markdown] [PDF] 原始图像:Original Images 点云结果:Point Cloud Results-1

HMT_Curo 19 Dec 26, 2022
Pytorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)

Vision Transformer Pytorch reimplementation of Google's repository for the ViT model that was released with the paper An Image is Worth 16x16 Words: T

Eunkwang Jeon 1.4k Dec 28, 2022
PipeTransformer: Automated Elastic Pipelining for Distributed Training of Large-scale Models

PipeTransformer: Automated Elastic Pipelining for Distributed Training of Large-scale Models This repository is the official implementation of the fol

DistributedML 41 Dec 06, 2022
Reinforcement learning for self-driving in a 3D simulation

SelfDrive_AI Reinforcement learning for self-driving in a 3D simulation (Created using UNITY-3D) 1. Requirements for the SelfDrive_AI Gym You need Pyt

Surajit Saikia 17 Dec 14, 2021
Visyerres sgdf woob - Modules Woob pour l'intranet et autres sites Scouts et Guides de France

Vis'Yerres SGDF - Modules Woob Vous avez le sentiment que l'intranet des Scouts

Thomas Touhey (pas un pseudonyme) 3 Dec 24, 2022
Public repository created to store my custom-made tools for Just Dance (UbiArt Engine)

Woody's Just Dance Tools Public repository created to store my custom-made tools for Just Dance (UbiArt Engine) Development and updates Almost all of

Wodson de Andrade 8 Dec 24, 2022
Fine-tune pretrained Convolutional Neural Networks with PyTorch

Fine-tune pretrained Convolutional Neural Networks with PyTorch. Features Gives access to the most popular CNN architectures pretrained on ImageNet. A

Alex Parinov 694 Nov 23, 2022
Price-Prediction-For-a-Dream-Home - A machine learning based linear regression trained model for house price prediction.

Price-Prediction-For-a-Dream-Home ROADMAP TO THIS LINEAR REGRESSION BASED HOUSE PRICE PREDICTION PREDICTION MODEL Import all the dependencies of the p

DIKSHA DESWAL 1 Dec 29, 2021
Semi Supervised Learning for Medical Image Segmentation, a collection of literature reviews and code implementations.

Semi-supervised-learning-for-medical-image-segmentation. Recently, semi-supervised image segmentation has become a hot topic in medical image computin

Healthcare Intelligence Laboratory 1.3k Jan 03, 2023
Open-source Monocular Python HawkEye for Tennis

Tennis Tracking 🎾 Objectives Track the ball Detect court lines Detect the players To track the ball we used TrackNet - deep learning network for trac

ArtLabs 188 Jan 08, 2023
'Solving the sampling problem of the Sycamore quantum supremacy circuits

solve_sycamore This repo contains data, contraction code, and contraction order for the paper ''Solving the sampling problem of the Sycamore quantum s

Feng Pan 29 Nov 28, 2022
Google AI Open Images - Object Detection Track: Open Solution

Google AI Open Images - Object Detection Track: Open Solution This is an open solution to the Google AI Open Images - Object Detection Track 😃 More c

minerva.ml 46 Jun 22, 2022
Corgis are the cutest creatures; have 30K of them!

corgi-net This is a dataset of corgi images scraped from the corgi subreddit. After filtering using an ImageNet classifier, the training set consists

Alex Nichol 6 Dec 24, 2022
ANEA: Distant Supervision for Low-Resource Named Entity Recognition

ANEA: Distant Supervision for Low-Resource Named Entity Recognition ANEA is a tool to automatically annotate named entities in unlabeled text based on

Saarland University Spoken Language Systems Group 15 Mar 30, 2022
PyTorch implementation of ICLR 2022 paper PiCO: Contrastive Label Disambiguation for Partial Label Learning

PiCO: Contrastive Label Disambiguation for Partial Label Learning This is a PyTorch implementation of ICLR 2022 Oral paper PiCO; also see our Project

王皓波 147 Jan 07, 2023
Data and code for the paper "Importance of Kernel Bandwidth in Quantum Machine Learning"

Reproducibility materials for "Importance of Kernel Bandwidth in Quantum Machine Learning" Repo structure: code contains Python scripts used to genera

Ruslan Shaydulin 3 Oct 23, 2022
The self-supervised goal reaching benchmark introduced in Discovering and Achieving Goals via World Models

Lexa-Benchmark Codebase for the self-supervised goal reaching benchmark introduced in 'Discovering and Achieving Goals via World Models'. Setup Create

1 Oct 14, 2021