Weighted QMIX: Expanding Monotonic Value Function Factorisation

Related tags

Deep Learningwqmix
Overview

Weighted QMIX: Expanding Monotonic Value Function Factorisation (NeurIPS 2020)

Based on PyMARL (https://github.com/oxwhirl/pymarl/). Please refer to that repo for more documentation.

This repo contains the cleaned-up code that was used in "Weighted QMIX: Expanding Monotonic Value Function Factorisation" (https://arxiv.org/abs/2006.10800).

Included in this repo

In particular implementations for:

  • OW-QMIX
  • CW-QMIX
  • Versions of DDPG & SAC used in the paper

We thank the authors of "QPLEX: Duplex Dueling Multi-Agent Q-Learning" (https://arxiv.org/abs/2008.01062) for their implementation of QPLEX (https://github.com/wjh720/QPLEX/), whose implementation we used. The exact implementation we used is included in this repo.

Note that in the repository the naming of certain hyper-parameters and concepts is a little different to the paper:

  • α in the paper is w in the code
  • Optimistic Weighting (OW) is referred to as hysteretic_qmix

For all SMAC experiments we used SC2.4.6.2.69232 (not SC2.4.10). The underlying dynamics are sufficiently different that you cannot compare runs across the 2 versions!

The install_sc2.sh script will install SC2.4.6.2.69232.

Running experiments

The config files (src/config/algs/*.yaml) contain default hyper-parameters for the respective algorithms. These were changed when running the experiments for the paper (epsilon_anneal_time = 1000000 for the robustness to exploration experiments, and w=0.1 for the predator prey punishment experiments for instance). Please see the Appendix of the paper for the exact hyper-parameters used.

Set central_mixer=atten to get the modified mixing network architecture that was used for the final experiment on corridor in the paper.

As an example, to run the OW-QMIX on 3s5z with epsilon annealed over 1mil timesteps using docker:

bash run.sh $GPU python3 src/main.py --config=ow_qmix --env-config=sc2 with env_args.map_name=3s5z w=0.5 epsilon_anneal_time=1000000

Citing

Bibtex:

@inproceedings{rashid2020weighted,
  title={Weighted QMIX: Expanding Monotonic Value Function Factorisation},
  author={Rashid, Tabish and Farquhar, Gregory and Peng, Bei and Whiteson, Shimon},
  booktitle={Advances in Neural Information Processing Systems},
  year={2020}
}
Owner
whirl
Whiteson Research Lab
whirl
This is the repo for our work "Towards Persona-Based Empathetic Conversational Models" (EMNLP 2020)

Towards Persona-Based Empathetic Conversational Models (PEC) This is the repo for our work "Towards Persona-Based Empathetic Conversational Models" (E

Zhong Peixiang 35 Nov 17, 2022
Unicorn can be used for performance analyses of highly configurable systems with causal reasoning

Unicorn can be used for performance analyses of highly configurable systems with causal reasoning. Users or developers can query Unicorn for a performance task.

AISys Lab 27 Jan 05, 2023
Implementation of "RaScaNet: Learning Tiny Models by Raster-Scanning Image" from CVPR 2021.

RaScaNet: Learning Tiny Models by Raster-Scanning Images Deploying deep convolutional neural networks on ultra-low power systems is challenging, becau

SAIT (Samsung Advanced Institute of Technology) 5 Dec 26, 2022
ICSS - Interactive Continual Semantic Segmentation

Presentation This repository contains the code of our paper: Weakly-supervised c

Alteia 9 Jul 23, 2022
Analysis of Smiles through reservoir sampling & RDkit

Analysis of Smiles through reservoir sampling and machine learning (under development). This is a simple project that includes two Jupyter files for t

Aurimas A. Nausėdas 6 Aug 30, 2022
This reposityory contains the PyTorch implementation of our paper "Generative Dynamic Patch Attack".

Generative Dynamic Patch Attack This reposityory contains the PyTorch implementation of our paper "Generative Dynamic Patch Attack". Requirements PyTo

Xiang Li 8 Nov 17, 2022
This repository contains code used to audit the stability of personality predictions made by two algorithmic hiring systems

Stability Audit This repository contains code used to audit the stability of personality predictions made by two algorithmic hiring systems, Humantic

Data, Responsibly 4 Oct 27, 2022
BLEURT is a metric for Natural Language Generation based on transfer learning.

BLEURT: a Transfer Learning-Based Metric for Natural Language Generation BLEURT is an evaluation metric for Natural Language Generation. It takes a pa

Google Research 492 Jan 05, 2023
Real life contra a deep learning project built using mediapipe and openc

real-life-contra Description A python script that translates the body movement into in game control. Welcome to all new real life contra a deep learni

Programminghut 7 Jan 26, 2022
Neural Caption Generator with Attention

Neural Caption Generator with Attention Tensorflow implementation of "Show

Taeksoo Kim 510 Nov 30, 2022
JittorVis - Visual understanding of deep learning models

JittorVis: Visual understanding of deep learning model JittorVis is an open-source library for understanding the inner workings of Jittor models by vi

thu-vis 182 Jan 06, 2023
Multi-Agent Reinforcement Learning (MARL) method to learn scalable control polices for multi-agent target tracking.

scalableMARL Scalable Reinforcement Learning Policies for Multi-Agent Control CD. Hsu, H. Jeong, GJ. Pappas, P. Chaudhari. "Scalable Reinforcement Lea

Christopher Hsu 17 Nov 17, 2022
A Light in the Dark: Deep Learning Practices for Industrial Computer Vision

A Light in the Dark: Deep Learning Practices for Industrial Computer Vision This is the repository for our Paper/Contribution to the WI2022 in Nürnber

Maximilian Harl 6 Jan 17, 2022
In this project, we develop a face recognize platform based on MTCNN object-detection netcwork and FaceNet self-supervised network.

模式识别大作业——人脸检测与识别平台 本项目是一个简易的人脸检测识别平台,提供了人脸信息录入和人脸识别的功能。前端采用 html+css+js,后端采用 pytorch,

Xuhua Huang 5 Aug 02, 2022
TextureGAN in Pytorch

TextureGAN This code is our PyTorch implementation of TextureGAN [Project] [Arxiv] TextureGAN is a generative adversarial network conditioned on sketc

Patsorn 147 Dec 14, 2022
Tutorial to set up TensorFlow Object Detection API on the Raspberry Pi

A tutorial showing how to set up TensorFlow's Object Detection API on the Raspberry Pi

Evan 1.1k Dec 26, 2022
Breaching - Breaching privacy in federated learning scenarios for vision and text

Breaching - A Framework for Attacks against Privacy in Federated Learning This P

Jonas Geiping 139 Jan 03, 2023
Code for all the Advent of Code'21 challenges mostly written in python

Advent of Code 21 Code for all the Advent of Code'21 challenges mostly written in python. They are not necessarily the best or fastest solutions but j

4 May 26, 2022
Bayesian Optimization using GPflow

Note: This package is for use with GPFlow 1. For Bayesian optimization using GPFlow 2 please see Trieste, a joint effort with Secondmind. GPflowOpt GP

GPflow 257 Dec 26, 2022
BackgroundRemover lets you Remove Background from images and video with a simple command line interface

BackgroundRemover BackgroundRemover is a command line tool to remove background from video and image, made by nadermx to power https://BackgroundRemov

Johnathan Nader 1.7k Dec 30, 2022