Offline Reinforcement Learning with Implicit Q-Learning

This repository contains the official implementation of "Offline Reinforcement Learning with Implicit Q-Learning" by Ilya Kostrikov, Ashvin Nair, and Sergey Levine.

If you use this code for your research, please consider citing the paper:

@article{kostrikov2021iql,
    title={Offline Reinforcement Learning with Implicit Q-Learning},
    author={Ilya Kostrikov and Ashvin Nair and Sergey Levine},
    year={2021},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

How to run the code

Install dependencies

pip install -r requirements.txt

For GPU support, see the JAX installation instructions for the build that matches your CUDA version.

Run training

Locomotion

python train_offline.py --env_name=halfcheetah-medium-expert-v2 --config=configs/mujoco_config.py

AntMaze

python train_offline.py --env_name=antmaze-large-play-v0 --config=configs/antmaze_config.py --eval_episodes=100 --eval_interval=100000

Kitchen and Adroit

python train_offline.py --env_name=pen-human-v0 --config=configs/kitchen_config.py
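
The three training commands above differ only in the environment name, the config file, and, for AntMaze, two evaluation flags. Below is a minimal sketch of a helper that assembles those command lines, assuming only the flags and config paths shown above. The names `build_command`, `CONFIG_BY_FAMILY`, `EXTRA_FLAGS`, and the family labels are hypothetical, introduced here for illustration, and are not part of the repository.

```python
import shlex

# Config file per environment family, copied from the commands above.
# The family labels themselves are hypothetical names for this sketch.
CONFIG_BY_FAMILY = {
    "locomotion": "configs/mujoco_config.py",
    "antmaze": "configs/antmaze_config.py",
    "kitchen_adroit": "configs/kitchen_config.py",
}

# Extra evaluation flags that the AntMaze command above passes.
EXTRA_FLAGS = {
    "antmaze": ["--eval_episodes=100", "--eval_interval=100000"],
}


def build_command(env_name, family):
    """Return the argument list for one train_offline.py run (hypothetical helper)."""
    if family not in CONFIG_BY_FAMILY:
        raise ValueError(f"unknown environment family: {family!r}")
    cmd = [
        "python",
        "train_offline.py",
        f"--env_name={env_name}",
        f"--config={CONFIG_BY_FAMILY[family]}",
    ]
    # Append family-specific flags, if any.
    cmd += EXTRA_FLAGS.get(family, [])
    return cmd


if __name__ == "__main__":
    runs = [
        ("halfcheetah-medium-expert-v2", "locomotion"),
        ("antmaze-large-play-v0", "antmaze"),
        ("pen-human-v0", "kitchen_adroit"),
    ]
    for env_name, family in runs:
        # Print each command; pass the list to subprocess.run to execute it.
        print(shlex.join(build_command(env_name, family)))
```

Run from the repository root, this prints the three commands listed above. To launch a run instead of printing it, pass the returned list to `subprocess.run`.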

Misc

The implementation is based on JAXRL.
