Offline Reinforcement Learning with Implicit Q-Learning

This repository contains the official implementation of "Offline Reinforcement Learning with Implicit Q-Learning" by Ilya Kostrikov, Ashvin Nair, and Sergey Levine.

If you use this code for your research, please consider citing the paper:

@article{kostrikov2021iql,
    title={Offline Reinforcement Learning with Implicit Q-Learning},
    author={Ilya Kostrikov and Ashvin Nair and Sergey Levine},
    year={2021},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

How to run the code

Install dependencies

pip install -r requirements.txt

For GPU support, see the JAX installation instructions for CUDA.
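
Because the implementation is based on JAXRL, and therefore on JAX, it can be useful to check which backend JAX picked up after installation. The snippet below is a minimal sketch and not part of this repository; the helper name `jax_backend_summary` is hypothetical, and it uses only `jax.default_backend()` and `jax.devices()`.

```python
import importlib.util


def jax_backend_summary():
    """Return a short description of the JAX backend, if JAX is installed.

    Hypothetical helper for illustration; not part of this repository.
    """
    if importlib.util.find_spec("jax") is None:
        return "jax is not installed"
    import jax

    # default_backend() reports the platform name, e.g. "cpu" or "gpu".
    backend = jax.default_backend()
    devices = ", ".join(str(d) for d in jax.devices())
    return f"backend={backend}; devices={devices}"


if __name__ == "__main__":
    print(jax_backend_summary())
```

If the reported backend is `cpu` on a machine with an NVIDIA GPU, the installed JAX build likely lacks CUDA support.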

Run training

Locomotion

python train_offline.py --env_name=halfcheetah-medium-expert-v2 --config=configs/mujoco_config.py

AntMaze

python train_offline.py --env_name=antmaze-large-play-v0 --config=configs/antmaze_config.py --eval_episodes=100 --eval_interval=100000

Kitchen and Adroit

python train_offline.py --env_name=pen-human-v0 --config=configs/kitchen_config.py
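
To run several environments in a row, the three commands above can be generated from a small table. This is a rough sketch under stated assumptions: the suite labels (`locomotion`, `antmaze`, `kitchen_adroit`) and the helper `build_command` are hypothetical names introduced here, while the script name, config paths, and flags are copied from the commands above.

```python
import shlex

# Config files as they appear in the commands above, keyed by
# hypothetical suite labels chosen for this sketch.
CONFIGS = {
    "locomotion": "configs/mujoco_config.py",
    "antmaze": "configs/antmaze_config.py",
    "kitchen_adroit": "configs/kitchen_config.py",
}

# Extra evaluation flags that the AntMaze command above passes.
EXTRA_FLAGS = {
    "antmaze": ["--eval_episodes=100", "--eval_interval=100000"],
}


def build_command(suite, env_name):
    """Build the argument list for one training run."""
    cmd = [
        "python",
        "train_offline.py",
        f"--env_name={env_name}",
        f"--config={CONFIGS[suite]}",
    ]
    cmd += EXTRA_FLAGS.get(suite, [])
    return cmd


if __name__ == "__main__":
    runs = [
        ("locomotion", "halfcheetah-medium-expert-v2"),
        ("antmaze", "antmaze-large-play-v0"),
        ("kitchen_adroit", "pen-human-v0"),
    ]
    for suite, env_name in runs:
        # Print each command; pass the list to subprocess.run to execute it.
        print(shlex.join(build_command(suite, env_name)))
```

Each printed line matches one of the commands listed above, so the output can be pasted into a shell or a job script.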

Misc

The implementation is based on JAXRL.
