PyTorch implementation of Munchausen Reinforcement Learning based on DQN and SAC. Handles both discrete and continuous action spaces.

Last update: Mar 10, 2022

Overview

Exploring Munchausen Reinforcement Learning

This is the project repository of my team in the "Advanced Deep Learning for Robotics" course at TUM. Our project's topic is "Exploring Munchausen Reinforcement Learning" based on this paper.

For a detailed discussion, see the report and the final presentation.

Setup

Create a virtual environment.
Run pip3 install -r requirements.txt

Code Structure

This repository is structured as follows:

The directories M-DQN and M-SAC contain the implementations of the RL agents DQN and SAC extended with the Munchausen term, respectively.
The directory rl-baselines3-zoo contains a copy of this repository, to which we added our implementation of M-DQN so that the M-DQN agent can easily be trained and tested on benchmark environments and compared with other classical agents. To do so, follow the steps described in the original repository and pass M-DQN as the agent argument.
The directory particles-env contains a modified version of this repository. The modified version contains code for a particles environment in which an agent tries to reach a goal while avoiding obstacles. The M-SAC agent is also implemented there, so that it can be trained and compared with the classical SAC agent.
The directory action-gap contains callbacks for the experiment manager of rl-baselines3-zoo that log the action gap to TensorBoard.
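
For readers unfamiliar with the Munchausen term and the action gap mentioned above, the following is a minimal sketch of both for a single transition with discrete actions. It is not the code in M-DQN or action-gap: it uses only the Python standard library instead of PyTorch, the function names are hypothetical, and the default hyperparameters (alpha = 0.9, tau = 0.03, clipping at -1) are assumed from the values reported in the Munchausen RL paper and may differ from the settings used in this repository. The action gap is taken here to be the difference between the largest and second-largest Q-value of a state.

```python
import math


def log_softmax(q_values, tau):
    """Log-probabilities of the softmax policy pi = softmax(q / tau)."""
    scaled = [q / tau for q in q_values]
    m = max(scaled)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_z for s in scaled]


def munchausen_dqn_target(reward, action, q_s, q_next, done,
                          gamma=0.99, alpha=0.9, tau=0.03, clip_min=-1.0):
    """Regression target for one transition (s, a, r, s').

    q_s and q_next are the target-network Q-values for s and s'
    (hypothetical argument names, one value per discrete action).
    """
    # Munchausen bonus: scaled log-policy of the action actually taken,
    # clipped to [clip_min, 0] so that it stays bounded.
    log_pi_s = log_softmax(q_s, tau)
    bonus = alpha * min(0.0, max(clip_min, tau * log_pi_s[action]))

    # Entropy-regularised expectation of the next-state value
    # under the softmax policy.
    log_pi_next = log_softmax(q_next, tau)
    soft_value = sum(math.exp(lp) * (q - tau * lp)
                     for q, lp in zip(q_next, log_pi_next))

    return reward + bonus + (0.0 if done else gamma * soft_value)


def action_gap(q_values):
    """Difference between the largest and second-largest Q-value."""
    best, second = sorted(q_values, reverse=True)[:2]
    return best - second
```

Setting alpha to 0 removes the bonus and leaves a soft-DQN style target, which makes it easy to see what the Munchausen term adds: a non-positive penalty that grows as the chosen action becomes less likely under the current softmax policy.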


Owner

Mohamed Amine Ketata
