A PyTorch replication of the model-based reinforcement learning algorithm MBPO

Last update: Jan 05, 2023

Overview

This is a PyTorch re-implementation of the model-based RL algorithm MBPO, as described in the paper "When to Trust Your Model: Model-Based Policy Optimization".

This code builds on an earlier NeurIPS reproducibility challenge paper, which reproduced the results with a TensorFlow ensemble model but reported a significant drop in performance with a PyTorch ensemble model. This code re-implements the ensemble dynamics model in PyTorch and closes that gap.
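
For readers unfamiliar with the term, the following is a minimal, framework-free sketch of what an ensemble dynamics model does in MBPO-style methods: each member predicts a diagonal Gaussian over the next state, members are trained with a Gaussian negative log-likelihood, and a prediction is made by sampling from one randomly chosen member. It is not the code in this repository, which uses PyTorch networks; the names `gaussian_nll`, `ensemble_predict`, and `make_toy_member` are hypothetical, and the toy members stand in for trained networks.

```python
import math
import random


def gaussian_nll(mean, log_var, target):
    """Negative log-likelihood of `target` under a diagonal Gaussian.

    The constant term 0.5 * log(2 * pi) per dimension is dropped,
    as is common when the value is only used as a training loss.
    """
    total = 0.0
    for m, lv, t in zip(mean, log_var, target):
        total += 0.5 * (lv + (t - m) ** 2 / math.exp(lv))
    return total


def ensemble_predict(members, state, action, rng):
    """Pick one ensemble member uniformly at random and sample from it."""
    member = rng.choice(members)
    mean, log_var = member(state, action)
    # Standard deviation is exp(0.5 * log_var) for each dimension.
    return [rng.gauss(m, math.exp(0.5 * lv)) for m, lv in zip(mean, log_var)]


def make_toy_member(bias):
    """Hypothetical stand-in for a trained network (assumption, not repo code)."""
    def member(state, action):
        mean = [s + a + bias for s, a in zip(state, action)]
        log_var = [-20.0 for _ in state]  # nearly deterministic prediction
        return mean, log_var
    return member


# Usage: a two-member ensemble queried for one transition.
toy_members = [make_toy_member(0.0), make_toy_member(1.0)]
next_state = ensemble_predict(toy_members, [1.0], [2.0], random.Random(0))
```

Sampling a different member for each transition is one way the ensemble's disagreement shows up in the generated rollouts.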

Reproduced results

The comparison covers two tasks; other tasks have not been tested. On these two tasks, the PyTorch implementation achieves performance similar to the official TensorFlow code.

Dependencies

MuJoCo 1.5 & MuJoCo 2.0

Usage

python main_mbpo.py --env_name 'Walker2d-v2' --num_epoch 300 --model_type 'pytorch'

python main_mbpo.py --env_name 'Hopper-v2' --num_epoch 300 --model_type 'pytorch'
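
To launch both runs from one script, the commands above can be assembled programmatically. This is a small sketch that only uses the script name and flags shown above; the helper `build_command` and the list `ENVS` are hypothetical names, and it assumes the script is run from the repository root.

```python
import shlex
import sys

# Environments taken from the two usage commands above.
ENVS = ["Walker2d-v2", "Hopper-v2"]


def build_command(env_name, num_epoch=300, model_type="pytorch"):
    """Return the argument list for one training run of main_mbpo.py."""
    return [
        sys.executable,  # the Python interpreter running this script
        "main_mbpo.py",
        "--env_name", env_name,
        "--num_epoch", str(num_epoch),
        "--model_type", model_type,
    ]


if __name__ == "__main__":
    # Print the commands; nothing is executed here.
    for env in ENVS:
        print(shlex.join(build_command(env)))
```

Passing each list to `subprocess.run(cmd, check=True)` would run the experiments one after another.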

Reference

Official TensorFlow implementation: https://github.com/JannerM/mbpo
Code for the reproducibility challenge paper: https://github.com/jxu43/replication-mbpo

Owner

Xingyu Lin
