Scalable Multi-Agent Reinforcement Learning

Overview

Scalable Multi-Agent Reinforcement Learning

1. Featured algorithms:

  • Value Function Factorization with Variable Agent Sub-Teams (VAST) [1]

2. Implemented domains

All available domains are listed in the table below. The labels are used for the commands below (in 5. and 6.).

Domain Label Description
Warehouse[4] Warehouse-4 Warehouse domain with 4 agents in a 5x3 grid.
Warehouse[8] Warehouse-8 Warehouse domain with 8 agents in a 5x5 grid.
Warehouse[16] Warehouse-16 Warehouse domain with 16 agents in a 9x13 grid.
Battle[20] Battle-20 Battle domain with armies of 20 agents each in a 10x10 grid.
Battle[40] Battle-40 Battle domain with armies of 40 agents each in a 14x14 grid.
Battle[80] Battle-80 Battle domain with armies of 80 agents each in a 18x18 grid.
GaussianSqueeze[200] GaussianSqueeze-200 Gaussian squeeze domain 200 agents.
GaussianSqueeze[400] GaussianSqueeze-400 Gaussian squeeze domain 400 agents.
GaussianSqueeze[800] GaussianSqueeze-800 Gaussian squeeze domain 800 agents.

3. Implemented MARL algorithms

The reported MARL algorithms are listed in the tables below. The labels are used for the commands below (in 5. and 6.).

Baseline Label
IL IL
QMIX QMIX
QTRAN QTRAN
VAST(VFF operator) Label
VAST(IL) VAST-IL
VAST(VDN) VAST-VDN
VAST(QMIX) VAST-QMIX
VAST(QTRAN) VAST-QTRAN
VAST(assignment strategy) Label
VAST(Random) VAST-QTRAN-RANDOM
VAST(Fixed) VAST-QTRAN-FIXED
VAST(Spatial) VAST-QTRAN-SPATIAL
VAST(MetaGrad) VAST-QTRAN

4. Experiment parameters

The experiment parameters like the learning rate for training (params["learning_rate"]) or the number of episodes per epoch (params["episodes_per_epoch"]) are specified in settings.py. All other hyperparameters are set in the corresponding python modules in the package vast/controllers, where all final values as listed in the technical appendix are specified as default value.

All hyperparameters can be adjusted by setting their values via the params dictionary in settings.py.

5. Training

To train a MARL algorithm M (see tables in 3.) in domain D (see table in 2.) with compactness factor eta, run the following command:

python train.py M D eta

This command will create a folder with the name pattern output/N-agents_domain-D_subteams-S_M_datetime which contains the trained models (depending on the MARL algorithm).

train.sh is an example script for running all settings as specified in the paper.

6. Plotting

To generate plots for a particular domain D and evaluation mode E as presented in the paper, run the following command:

python plot.py M E

The command will load and display all the data of completed training runs that are stored in the folder which is specified in params["output_folder"] (see settings.py).

The evaluation mode E are specified in the table below:

Evaluation mode Label
VFF operator comparison F
State-of-the-art comparison S
Assignment strategy comparison A
Division diversity comparison D

7. Rendering

To render episodes of the Warehouse[N] or Battle[N] domain, set params["render_pygame"]=True in settings.py.

8. References

  • [1] T. Phan et al., "VAST: Value Function Factorization with Variable Agent Sub-Teams", in NeurIPS 2021
ZEBRA: Zero Evidence Biometric Recognition Assessment

ZEBRA: Zero Evidence Biometric Recognition Assessment license: LGPLv3 - please reference our paper version: 2020-06-11 author: Andreas Nautsch (EURECO

Voice Privacy Challenge 2 Dec 12, 2021
Code base for reproducing results of I.Schubert, D.Driess, O.Oguz, and M.Toussaint: Learning to Execute: Efficient Learning of Universal Plan-Conditioned Policies in Robotics. NeurIPS (2021)

Learning to Execute (L2E) Official code base for completely reproducing all results reported in I.Schubert, D.Driess, O.Oguz, and M.Toussaint: Learnin

3 May 18, 2022
Romanian Automatic Speech Recognition from the ROBIN project

RobinASR This repository contains Robin's Automatic Speech Recognition (RobinASR) for the Romanian language based on the DeepSpeech2 architecture, tog

RACAI 10 Jan 01, 2023
This is the code of NeurIPS'21 paper "Towards Enabling Meta-Learning from Target Models".

ST This is the code of NeurIPS 2021 paper "Towards Enabling Meta-Learning from Target Models". If you use any content of this repo for your work, plea

Su Lu 7 Dec 06, 2022
The official PyTorch implementation for NCSNv2 (NeurIPS 2020)

Improved Techniques for Training Score-Based Generative Models This repo contains the official implementation for the paper Improved Techniques for Tr

174 Dec 26, 2022
Determined: Deep Learning Training Platform

Determined: Deep Learning Training Platform Determined is an open-source deep learning training platform that makes building models fast and easy. Det

Determined AI 2k Dec 31, 2022
Code for paper "Which Training Methods for GANs do actually Converge? (ICML 2018)"

GAN stability This repository contains the experiments in the supplementary material for the paper Which Training Methods for GANs do actually Converg

Lars Mescheder 885 Jan 01, 2023
DumpSMBShare - A script to dump files and folders remotely from a Windows SMB share

DumpSMBShare A script to dump files and folders remotely from a Windows SMB shar

Podalirius 178 Jan 06, 2023
Theory-inspired Parameter Control Benchmarks for Dynamic Algorithm Configuration

This repo is for the paper: Theory-inspired Parameter Control Benchmarks for Dynamic Algorithm Configuration The DAC environment is based on the Dynam

Carola Doerr 1 Aug 19, 2022
This program can detect your face and add an Christams hat on the top of your head

Auto_Christmas This program can detect your face and add a Christmas hat to the top of your head. just run the Auto_Christmas.py, then you can see the

3 Dec 22, 2021
Multi-Modal Machine Learning toolkit based on PyTorch.

简体中文 | English TorchMM 简介 多模态学习工具包 TorchMM 旨在于提供模态联合学习和跨模态学习算法模型库,为处理图片文本等多模态数据提供高效的解决方案,助力多模态学习应用落地。 近期更新 2022.1.5 发布 TorchMM 初始版本 v1.0 特性 丰富的任务场景:工具

njustkmg 1 Jan 05, 2022
This is the code for CVPR 2021 oral paper: Jigsaw Clustering for Unsupervised Visual Representation Learning

JigsawClustering Jigsaw Clustering for Unsupervised Visual Representation Learning Pengguang Chen, Shu Liu, Jiaya Jia Introduction This project provid

DV Lab 73 Sep 18, 2022
PyTorch implementation of Soft-DTW: a Differentiable Loss Function for Time-Series in CUDA

Soft DTW Loss Function for PyTorch in CUDA This is a Pytorch Implementation of Soft-DTW: a Differentiable Loss Function for Time-Series which is batch

Keon Lee 76 Dec 20, 2022
Multi-Task Learning as a Bargaining Game

Nash-MTL Official implementation of "Multi-Task Learning as a Bargaining Game". Setup environment conda create -n nashmtl python=3.9.7 conda activate

Aviv Navon 87 Dec 26, 2022
PyTorch Implementation for AAAI'21 "Do Response Selection Models Really Know What's Next? Utterance Manipulation Strategies for Multi-turn Response Selection"

UMS for Multi-turn Response Selection Implements the model described in the following paper Do Response Selection Models Really Know What's Next? Utte

Taesun Whang 47 Nov 22, 2022
This repository attempts to replicate the SqueezeNet architecture and implement the same on an image classification task.

SqueezeNet-Implementation This repository attempts to replicate the SqueezeNet architecture using TensorFlow discussed in the research paper: "Squeeze

Rohan Mathur 3 Dec 13, 2022
An Unsupervised Detection Framework for Chinese Jargons in the Darknet

An Unsupervised Detection Framework for Chinese Jargons in the Darknet This repo is the Python 3 implementation of 《An Unsupervised Detection Framewor

7 Nov 08, 2022
A `Neural = Symbolic` framework for sound and complete weighted real-value logic

Logical Neural Networks LNNs are a novel Neuro = symbolic framework designed to seamlessly provide key properties of both neural nets (learning) and s

International Business Machines 138 Dec 19, 2022
This code finds bounding box of a single human mouth.

This code finds bounding box of a single human mouth. In comparison to other face segmentation methods, it is relatively insusceptible to open mouth conditions, e.g., yawning, surgical robots, etc. T

iThermAI 4 Nov 27, 2022
Denoising Diffusion Implicit Models

Denoising Diffusion Implicit Models (DDIM) Jiaming Song, Chenlin Meng and Stefano Ermon, Stanford Implements sampling from an implicit model that is t

465 Jan 05, 2023