MBPO (paper: When to trust your model: Model-based policy optimization) in offline RL settings

Last update: Oct 24, 2021

Overview

offline-MBPO

This repository contains a version of the model-based RL algorithm MBPO, modified to run in offline RL settings.
Paper: When to trust your model: Model-based policy optimization
With many thanks, this code is based on Xingyu-Lin's easy-to-read PyTorch implementation of MBPO.

Requirements

See requirements.txt.
The code depends on D4RL's environments and datasets.
Only the hopper, walker, halfcheetah and ant environments are supported right now (to evaluate in other environments, modify the termination function in predict_env.py).
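
For readers who want to add an environment, the sketch below illustrates what a termination function for model-generated rollouts can look like. It is not the code in predict_env.py: the function name `termination_fn`, its signature, and the substring matching on the environment name are assumptions made for illustration. The hopper thresholds follow the commonly used Gym Hopper termination rule (torso height above 0.7, torso angle magnitude below 0.2, state entries bounded by 100) and should be checked against the environment you actually use.

```python
import numpy as np


def termination_fn(env_name, obs, act, next_obs):
    """Hypothetical helper: flag which predicted next states end the episode.

    obs, next_obs: arrays of shape (batch, obs_dim); act is unused here.
    Returns a boolean array of shape (batch, 1), True where the episode ends.
    """
    if "halfcheetah" in env_name:
        # HalfCheetah has no early termination, so nothing is ever done.
        return np.zeros((next_obs.shape[0], 1), dtype=bool)

    if "hopper" in env_name:
        height = next_obs[:, 0]  # torso height (assumed observation layout)
        angle = next_obs[:, 1]   # torso angle (assumed observation layout)
        not_done = (
            np.isfinite(next_obs).all(axis=-1)
            & (np.abs(next_obs[:, 1:]) < 100).all(axis=-1)
            & (height > 0.7)
            & (np.abs(angle) < 0.2)
        )
        return ~not_done[:, None]

    # Any other environment needs its own rule added here.
    raise NotImplementedError(f"no termination rule for {env_name}")
```

A new environment would get its own branch that encodes that environment's termination rule on the predicted `next_obs`; environments without early termination can reuse the all-False case.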

Usage

Simply run:

    python main_mbpo.py --env_name=halfcheetah-medium-v0 --seed=1234

Or modify the script runalgo.sh, then run:

    bash runalgo.sh
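
To repeat the command above over several seeds, a small launcher along the following lines may be convenient. This is only a sketch and does not reproduce the contents of runalgo.sh: the helper names `build_commands` and `run_all` are hypothetical, the seed values other than 1234 are arbitrary, and the only flags used are `--env_name` and `--seed` from the command shown above.

```python
import subprocess
import sys


def build_commands(env_names, seeds, script="main_mbpo.py"):
    """Hypothetical helper: one argv list per (environment, seed) pair."""
    return [
        [sys.executable, script, f"--env_name={env}", f"--seed={seed}"]
        for env in env_names
        for seed in seeds
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it sequentially only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Seeds are arbitrary examples; the dataset name is the one used above.
    run_all(build_commands(["halfcheetah-medium-v0"], [1234, 2345, 3456]))
```

By default the script only prints the commands; pass `dry_run=False` to `run_all` to launch the runs one after another.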

Owner

LxzGordon

UCAS PhD student.


