Code accompanying the paper "Learning Nearly Decomposable Value Functions with Communication Minimization" (ICLR 2020)

Last update: Nov 26, 2022

Overview

NDQ: Learning Nearly Decomposable Value Functions with Communication Minimization

Note

This codebase accompanies the paper "Learning Nearly Decomposable Value Functions with Communication Minimization" and is based on the open-source PyMARL and SMAC codebases.

Implementations of several other methods, written by the authors of PyMARL, can also be found in this codebase.

Build the Docker image using:

cd docker
bash build.sh

Set up StarCraft II and SMAC:

bash install_sc2.sh

This will download SC2 into the 3rdparty folder and copy over the maps needed to run the experiments.

The requirements.txt file can be used to install the necessary packages into a virtual environment (not recommended).

Run an experiment

The following command trains NDQ on the didactic task hallway:

python3 src/main.py \
--config=categorical_qmix \
--env-config=join1 \
with \
env_args.n_agents=2 \
env_args.state_numbers=[6,6] \
obs_last_action=False \
comm_embed_dim=3 \
c_beta=0.1 \
comm_beta=1e-2 \
comm_entropy_beta=0. \
batch_size_run=16 \
t_max=2e7 \
local_results_path=$DATA_PATH \
is_cur_mu=True \
is_rank_cut_mu=True \
runner="parallel_x" \
test_interval=100000

The config files act as defaults for an algorithm or environment.

They are all located in src/config. --config refers to the config files in src/config/algs, and --env-config refers to the config files in src/config/envs.
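As a rough illustration of what "defaults" means here, the sketch below layers an environment config, an algorithm config, and command-line overrides with a recursive dictionary merge. The dictionaries and the helper name `recursive_update` are hypothetical stand-ins chosen for this example, not the actual contents of the files in src/config or the codebase's own loader.

```python
import copy
from collections.abc import Mapping


def recursive_update(base, override):
    """Return a copy of `base` with `override` merged in, descending into nested dicts."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), Mapping):
            merged[key] = recursive_update(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical defaults, standing in for files in src/config/envs and src/config/algs.
env_defaults = {"env_args": {"n_agents": 3, "state_numbers": [4, 4, 4]}}
alg_defaults = {"comm_embed_dim": 2, "comm_beta": 0.001}

# Values given after `with` on the command line take precedence over both.
cli_overrides = {"env_args": {"n_agents": 2}, "comm_embed_dim": 3}

config = recursive_update(recursive_update(env_defaults, alg_defaults), cli_overrides)
print(config)
```

Keys that are not overridden (here `state_numbers` and `comm_beta`) keep their default values, which is why only the settings that differ from the defaults need to appear on the command line.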

To train NDQ on SC2 tasks, run the following command:

python3 src/main.py \
--config=categorical_qmix \
--env-config=sc2 \
with \
env_args.map_name=bane_vs_hM \
env_args.sight_range=2 \
env_args.shoot_range=2 \
env_args.obs_all_health=False \
env_args.obs_enemy_health=False \
comm_embed_dim=3 \
c_beta=0.1 \
comm_beta=0.0001 \
comm_entropy_beta=0.0 \
batch_size_run=16 \
runner="parallel_x"
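The dotted `key=value` arguments after `with` address nested settings, so `env_args.map_name=...` sets `map_name` inside `env_args`. The following sketch shows one way such tokens could be turned into a nested dictionary. It is an illustration only, not the parser the codebase uses; the function name `parse_overrides` and the use of `ast.literal_eval` for type conversion are assumptions made for this example.

```python
import ast


def parse_overrides(tokens):
    """Turn ['a.b=1', 'c=True'] into {'a': {'b': 1}, 'c': True} (illustrative only)."""
    config = {}
    for token in tokens:
        dotted_key, _, raw_value = token.partition("=")
        try:
            # Numbers, booleans and lists are converted to Python values.
            value = ast.literal_eval(raw_value)
        except (ValueError, SyntaxError):
            # Bare words such as map names stay as strings.
            value = raw_value
        *parents, leaf = dotted_key.split(".")
        node = config
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return config


overrides = parse_overrides([
    "env_args.map_name=bane_vs_hM",
    "env_args.sight_range=2",
    "env_args.obs_all_health=False",
    "comm_beta=0.0001",
    "runner=parallel_x",  # the shell has already removed the quotes
])
print(overrides)
```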

SMAC maps can be found in src/smac_plus/sc2_maps/.

All results will be stored in the Results folder.

Saving and loading learnt models

Saving models

You can save the learnt models to disk by setting save_model = True (it is False by default). The saving frequency can be adjusted with the save_model_interval setting. Models are saved in the results directory, under a folder called models. The directory corresponding to each run contains the models saved throughout the experiment, each in a folder named after the number of timesteps elapsed since learning started.
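To make the layout concrete, here is a minimal sketch of how an interval-based save schedule and timestep-named folders might fit together. The function names, the run name, and the exact path layout (`<results>/models/<run>/<timestep>`) are assumptions made for illustration; check the codebase for the actual logic.

```python
import os


def checkpoint_dir(results_path, run_name, t_env):
    """Build a hypothetical checkpoint folder path named after the timestep."""
    return os.path.join(results_path, "models", run_name, str(t_env))


def saved_timesteps(t_max, save_model_interval, steps_per_rollout):
    """Simulate training and collect the timesteps at which a checkpoint would be written."""
    saved, last_save_t, t_env = [], 0, 0
    while t_env < t_max:
        t_env += steps_per_rollout  # one batch of environment interaction
        if t_env - last_save_t >= save_model_interval:
            saved.append(t_env)
            last_save_t = t_env
    return saved


for t in saved_timesteps(t_max=1000, save_model_interval=300, steps_per_rollout=120):
    print(checkpoint_dir("results", "run_1", t))
```

Because timesteps advance in chunks, the folder names in this sketch are the first timesteps at or past each interval boundary rather than exact multiples of save_model_interval.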

Loading models

Learnt models can be loaded using the checkpoint_path parameter, after which the learning will proceed from the corresponding timestep.
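Since each checkpoint lives in a folder named after its timestep, a loader needs to pick one of those folders. The sketch below selects the most recent one. It is a hedged illustration under the assumption that checkpoint_path points at a run directory containing numerically named subfolders; the helper names are hypothetical and the codebase may choose the timestep differently.

```python
import os
import tempfile


def available_timesteps(checkpoint_path):
    """List the timesteps for which a numerically named subfolder exists."""
    steps = []
    for name in os.listdir(checkpoint_path):
        if name.isdigit() and os.path.isdir(os.path.join(checkpoint_path, name)):
            steps.append(int(name))
    return sorted(steps)


def latest_checkpoint(checkpoint_path):
    """Return (timestep, folder) for the most recent checkpoint."""
    steps = available_timesteps(checkpoint_path)
    if not steps:
        raise FileNotFoundError(f"no checkpoints found in {checkpoint_path}")
    step = steps[-1]
    return step, os.path.join(checkpoint_path, str(step))


# Demonstration on a throwaway directory with two fake checkpoints.
with tempfile.TemporaryDirectory() as run_dir:
    for step in (100000, 2000000):
        os.makedirs(os.path.join(run_dir, str(step)))
    print(latest_checkpoint(run_dir)[0])
```

Sorting the folder names as integers matters here: sorted as strings, "2000000" would come before "300000".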

Watching StarCraft II replays

The save_replay option allows saving replays of models which are loaded using checkpoint_path. Once the model is successfully loaded, test_nepisode episodes are run in test mode and a .SC2Replay file is saved in the Replay directory of StarCraft II. Please make sure to use the episode runner if you wish to save a replay, i.e., runner=episode. The name of the saved replay file starts with the given env_args.save_replay_prefix (map_name if empty), followed by the current timestamp.
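The naming rule for replay files can be sketched as follows. The prefix fallback mirrors the description above; the function name, the timestamp format, and the underscore separator are assumptions made for this example and may not match what StarCraft II actually writes.

```python
from datetime import datetime, timezone


def replay_file_name(map_name, save_replay_prefix="", now=None):
    """Build a replay file name: prefix (or map name if empty) plus a timestamp."""
    prefix = save_replay_prefix or map_name
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%d-%H-%M-%S")  # assumed timestamp format
    return f"{prefix}_{stamp}.SC2Replay"


print(replay_file_name("bane_vs_hM"))
print(replay_file_name("bane_vs_hM", save_replay_prefix="ndq_eval"))
```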

The saved replays can be watched by double-clicking on them or using the following command:

python -m pysc2.bin.play --norender --rgb_minimap_size 0 --replay NAME.SC2Replay

Note: Replays cannot be watched using the Linux version of StarCraft II. Please use either the Mac or Windows version of the StarCraft II client.

Owner

Tonghan Wang
