Rethinking the Importance of Implementation Tricks in Multi-Agent Reinforcement Learning

Last update: Jan 06, 2023

Overview

RIIT

This is our open-source code for RIIT: Rethinking the Importance of Implementation Tricks in Multi-Agent Reinforcement Learning. We implement numerous QMIX variant algorithms and standardize their hyperparameters so that they achieve state-of-the-art (SOTA) results.

Python MARL framework

PyMARL is WhiRL's framework for deep multi-agent reinforcement learning and includes implementations of the following algorithms:

Value-based Methods:

Actor Critic Methods:

PyMARL is written in PyTorch and uses SMAC as its environment.

Installation instructions

Install Python packages

# requires Anaconda 3 or Miniconda 3
bash install_dependecies.sh

Set up StarCraft II and SMAC:

bash install_sc2.sh

This will download SC2 into the 3rdparty folder and copy over the maps needed to run the experiments.

Run an experiment

# For SMAC
python3 src/main.py --config=qmix --env-config=sc2 with env_args.map_name=corridor

# For Cooperative Predator-Prey
python3 src/main.py --config=qmix_prey --env-config=stag_hunt with env_args.map_name=stag_hunt

The config files act as defaults for an algorithm or environment.

They are all located in src/config. --config refers to the config files in src/config/algs, and --env-config refers to the config files in src/config/envs.
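To illustrate what "defaults" means here, the following is a minimal sketch of layered configuration: environment defaults, then algorithm defaults, then a `key=value` override like the one passed after `with`. It is not the code in src/main.py. The helper names `recursive_update` and `parse_override` and the dictionary contents are hypothetical stand-ins for the files in src/config, and the merge order is an assumption.

```python
from copy import deepcopy


def recursive_update(base, override):
    """Return a copy of `base` with `override` merged in, recursing into nested dicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = recursive_update(merged[key], value)
        else:
            merged[key] = value
    return merged


def parse_override(text):
    """Turn a dotted `key.sub=value` string into a nested dict (the value stays a string)."""
    dotted_key, value = text.split("=", 1)
    nested = value
    for part in reversed(dotted_key.split(".")):
        nested = {part: nested}
    return nested


# Hypothetical stand-ins for files under src/config/envs and src/config/algs.
env_defaults = {"env": "sc2", "env_args": {"map_name": "3m", "difficulty": "7"}}
alg_defaults = {"name": "qmix", "epsilon_anneal_time": 100000}

# Later layers win: env defaults, then algorithm defaults, then the command-line override.
config = recursive_update(env_defaults, alg_defaults)
config = recursive_update(config, parse_override("env_args.map_name=corridor"))
```

In this sketch only `env_args.map_name` changes; every other key keeps the value it had in the defaults.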

Run parallel experiments:

# bash run.sh config_name map_name_list (threads_num arg_list gpu_list experiments_num)
bash run.sh qmix corridor 2 epsilon_anneal_time=500000 0,1 5

Each xxx_list argument is a comma-separated list.

All results are stored in the Results folder and named after map_name.
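As a rough picture of how the comma-separated arguments could expand into individual runs, here is a small sketch. It is not the contents of run.sh: the function `expand_runs` is hypothetical, the round-robin GPU assignment is an assumption, and threads_num (which would limit how many jobs run at once) is left out.

```python
from itertools import cycle


def expand_runs(config, map_names, arg_list, gpu_list, runs):
    """Build one (gpu, argv) pair per map and repetition from comma-separated lists."""
    maps = map_names.split(",")
    extra_args = arg_list.split(",") if arg_list else []
    gpus = cycle(gpu_list.split(","))  # assumption: GPUs are handed out round-robin
    jobs = []
    for _ in range(runs):
        for map_name in maps:
            argv = [
                "python3", "src/main.py",
                f"--config={config}", "--env-config=sc2",
                "with", f"env_args.map_name={map_name}", *extra_args,
            ]
            jobs.append((next(gpus), argv))
    return jobs


# Mirrors the arguments of: bash run.sh qmix corridor 2 epsilon_anneal_time=500000 0,1 5
jobs = expand_runs("qmix", "corridor", "epsilon_anneal_time=500000", "0,1", 5)
```

Under these assumptions the example yields five jobs for the corridor map, alternating between GPU 0 and GPU 1.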

Force all training processes to exit

# All Python and game processes of the current user will quit.
bash clean.sh

Some test results on Super Hard scenarios

Cite

@article{hu2021riit,
      title={RIIT: Rethinking the Importance of Implementation Tricks in Multi-Agent Reinforcement Learning}, 
      author={Jian Hu and Siyang Jiang and Seth Austin Harding and Haibin Wu and Shih-wei Liao},
      year={2021},
      eprint={2102.03479},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

