Pytorch implementations of popular off-policy multi-agent reinforcement learning algorithms, including QMix, VDN, MADDPG, and MATD3.

Overview

Off-Policy Multi-Agent Reinforcement Learning (MARL) Algorithms

This repository contains implementations of various off-policy multi-agent reinforcement learning (MARL) algorithms.

Authors: Akash Velu and Chao Yu

Algorithms supported:

  • MADDPG (MLP and RNN)
  • MATD3 (MLP and RNN)
  • QMIX (MLP and RNN)
  • VDN (MLP and RNN)

Environments supported:

1. Usage

WARNING #1: by default all experiments assume a shared policy by all agents i.e. there is one neural network shared by all agents

WARNING #2: only QMIX and MADDPG are thoroughly tested; however,our VDN and MATD3 implementations make small modifications to QMIX and MADDPG, respectively. We display results using our implementation here.

All core code is located within the offpolicy folder. The algorithms/ subfolder contains algorithm-specific code for all methods. RMADDPG and RMATD3 refer to RNN implementationso of MADDPG and MATD3, and mQMIX and mVDN refer to MLP implementations of QMIX and VDN. We additionally support prioritized experience replay (PER).

  • The envs/ subfolder contains environment wrapper implementations for the MPEs and SMAC.

  • Code to perform training rollouts and policy updates are contained within the runner/ folder - there is a runner for each environment.

  • Executable scripts for training with default hyperparameters can be found in the scripts/ folder. The files are named in the following manner: train_algo_environment.sh. Within each file, the map name (in the case of SMAC and the MPEs) can be altered.

  • Python training scripts for each environment can be found in the scripts/train/ folder.

  • The config.py file contains relevant hyperparameter and env settings. Most hyperparameters are defaulted to the ones used in the paper; however, please refer to the appendix for a full list of hyperparameters used.

2. Installation

Here we give an example installation on CUDA == 10.1. For non-GPU & other CUDA version installation, please refer to the PyTorch website.

# create conda environment
conda create -n marl python==3.6.1
conda activate marl
pip install torch==1.5.1+cu101 torchvision==0.6.1+cu101 -f https://download.pytorch.org/whl/torch_stable.html
# install on-policy package
cd on-policy
pip install -e .

Even though we provide requirement.txt, it may have redundancy. We recommend that the user try to install other required packages by running the code and finding which required package hasn't installed yet.

2.1 Install StarCraftII 4.10

unzip SC2.4.10.zip
# password is iagreetotheeula
echo "export SC2PATH=~/StarCraftII/" > ~/.bashrc

2.2 Install MPE

# install this package first
pip install seaborn

There are 3 Cooperative scenarios in MPE:

  • simple_spread
  • simple_speaker_listener, which is 'Comm' scenario in paper
  • simple_reference

3.Train

Here we use train_mpe_maddpg.sh as an example:

cd offpolicy/scripts
chmod +x ./train_mpe_maddpg.sh
./train_mpe_maddpg.sh

Local results are stored in subfold scripts/results. Note that we use Weights & Bias as the default visualization platform; to use Weights & Bias, please register and login to the platform first. More instructions for using Weights&Bias can be found in the official documentation. Adding the --use_wandb in command line or in the .sh file will use Tensorboard instead of Weights & Biases.

4. Results

Results for the performance of RMADDPG and QMIX on the Particle Envs and QMIX in SMAC are depicted here. These results are obtained using a normal (not prioitized) replay buffer.

Owner
This is a benchmark of popular multi-agent reinforcement learning algorithms & environments
Use .csv files to record, play and evaluate motion capture data.

Purpose These scripts allow you to record mocap data to, and play from .csv files. This approach facilitates parsing of body movement data in statisti

21 Dec 12, 2022
Open & Efficient for Framework for Aspect-based Sentiment Analysis

PyABSA - Open & Efficient for Framework for Aspect-based Sentiment Analysis Fast & Low Memory requirement & Enhanced implementation of Local Context F

YangHeng 567 Jan 07, 2023
GPU-Accelerated Deep Learning Library in Python

Hebel GPU-Accelerated Deep Learning Library in Python Hebel is a library for deep learning with neural networks in Python using GPU acceleration with

Hannes Bretschneider 1.2k Dec 21, 2022
Kaggleship: Kaggle Notebooks

Kaggleship: Kaggle Notebooks This repository contains my Kaggle notebooks. They are generally about data science, machine learning, and deep learning.

Erfan Sobhaei 1 Jan 25, 2022
A Lightweight Experiment & Resource Monitoring Tool 📺

Lightweight Experiment & Resource Monitoring 📺 "Did I already run this experiment before? How many resources are currently available on my cluster?"

170 Dec 28, 2022
Code for "SRHEN: Stepwise-Refining Homography Estimation Network via Parsing Geometric Correspondences in Deep Latent Space"

SRHEN This is a better and simpler implementation for "SRHEN: Stepwise-Refining Homography Estimation Network via Parsing Geometric Correspondences in

1 Oct 28, 2022
Pythonic particle-based (super-droplet) warm-rain/aqueous-chemistry cloud microphysics package with box, parcel & 1D/2D prescribed-flow examples in Python, Julia and Matlab

PySDM PySDM is a package for simulating the dynamics of population of particles. It is intended to serve as a building block for simulation systems mo

Atmospheric Cloud Simulation Group @ Jagiellonian University 32 Oct 18, 2022
PyTorch implementations of Generative Adversarial Networks.

This repository has gone stale as I unfortunately do not have the time to maintain it anymore. If you would like to continue the development of it as

Erik Linder-Norén 13.4k Jan 08, 2023
PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

Ubisoft 76 Dec 30, 2022
The repository for freeCodeCamp's YouTube course, Algorithmic Trading in Python

Algorithmic Trading in Python This repository Course Outline Section 1: Algorithmic Trading Fundamentals What is Algorithmic Trading? The Differences

Nick McCullum 1.8k Jan 02, 2023
LERP : Label-dependent and event-guided interpretable disease risk prediction using EHRs

LERP : Label-dependent and event-guided interpretable disease risk prediction using EHRs This is the code for the LERP. Dataset The dataset used is MI

5 Jun 18, 2022
Convolutional Neural Networks

Darknet Darknet is an open source neural network framework written in C and CUDA. It is fast, easy to install, and supports CPU and GPU computation. D

Joseph Redmon 23.7k Jan 05, 2023
YOLOX_AUDIO is an audio event detection model based on YOLOX

YOLOX_AUDIO is an audio event detection model based on YOLOX, an anchor-free version of YOLO. This repo is an implementated by PyTorch. Main goal of YOLOX_AUDIO is to detect and classify pre-defined

intflow Inc. 77 Dec 19, 2022
Pywonderland - A tour in the wonderland of math with python.

A Tour in the Wonderland of Math with Python A collection of python scripts for drawing beautiful figures and animating interesting algorithms in math

Zhao Liang 4.1k Jan 03, 2023
The 2nd place solution of 2021 google landmark retrieval on kaggle.

Leaderboard, taxonomy, and curated list of few-shot object detection papers.

229 Dec 13, 2022
Sub-Cluster AdaCos: Learning Representations for Anomalous Sound Detection.

Accompanying code for the paper Sub-Cluster AdaCos: Learning Representations for Anomalous Sound Detection.

Kevin Wilkinghoff 6 Dec 01, 2022
The code release of paper 'Domain Generalization for Medical Imaging Classification with Linear-Dependency Regularization' NIPS 2020.

Domain Generalization for Medical Imaging Classification with Linear Dependency Regularization The code release of paper 'Domain Generalization for Me

Yufei Wang 56 Dec 28, 2022
Vision-Language Transformer and Query Generation for Referring Segmentation (ICCV 2021)

Vision-Language Transformer and Query Generation for Referring Segmentation Please consider citing our paper in your publications if the project helps

Henghui Ding 143 Dec 23, 2022
A JAX-based research framework for writing differentiable numerical simulators with arbitrary discretizations

jaxdf - JAX-based Discretization Framework Overview | Example | Installation | Documentation ⚠️ This library is still in development. Breaking changes

UCL Biomedical Ultrasound Group 65 Dec 23, 2022
Set of methods to ensemble boxes from different object detection models, including implementation of "Weighted boxes fusion (WBF)" method.

Set of methods to ensemble boxes from different object detection models, including implementation of "Weighted boxes fusion (WBF)" method.

1.4k Jan 05, 2023