NovelD: A Simple yet Effective Exploration Criterion

Intro

This is an implementation of the method proposed in

NovelD: A Simple yet Effective Exploration Criterion and BeBold: Exploration Beyond the Boundary of Explored Regions
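
For orientation, the criterion described in the NovelD paper can be sketched in a few lines. This is a minimal illustration, not the code in this repository. It assumes novelty values are supplied as precomputed scalars (the paper measures novelty with an RND prediction error). The function name `noveld_reward`, the `episode_counts` dictionary, and the default `alpha=0.5` are hypothetical choices made for this example.

```python
from collections import defaultdict


def noveld_reward(novelty_curr, novelty_next, next_state_key, episode_counts, alpha=0.5):
    """Sketch of a NovelD-style intrinsic reward (illustrative only).

    novelty_curr / novelty_next: scalar novelty of s_t and s_{t+1}
        (assumed precomputed, e.g. by an RND-style predictor).
    next_state_key: hashable identifier of s_{t+1} (hypothetical).
    episode_counts: per-episode visit counts, reset at each episode start.
    alpha: scaling factor on the current state's novelty (assumed default).
    """
    # Record the visit to s_{t+1} within the current episode.
    episode_counts[next_state_key] += 1
    first_visit = episode_counts[next_state_key] == 1

    # Reward only a positive difference in novelty across the transition.
    bonus = max(novelty_next - alpha * novelty_curr, 0.0)

    # The bonus is granted only on the first visit in this episode.
    return bonus if first_visit else 0.0


# Usage: create a fresh counter at the start of every episode.
counts = defaultdict(int)
r_first = noveld_reward(0.2, 1.0, "s1", counts)   # first visit to "s1"
r_repeat = noveld_reward(0.2, 1.0, "s1", counts)  # repeat visit to "s1"
r_less = noveld_reward(1.0, 0.3, "s2", counts)    # moving to a less novel state
```

In this sketch the repeat visit and the move to a less novel state both yield zero reward, which reflects the idea of rewarding the agent only for crossing from explored into unexplored regions.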

Citation

If you use this code in your own work, please cite our papers:

@article{zhang2021noveld,
  title={NovelD: A Simple yet Effective Exploration Criterion},
  author={Zhang, Tianjun and Xu, Huazhe and Wang, Xiaolong and Wu, Yi and Keutzer, Kurt and Gonzalez, Joseph E and Tian, Yuandong},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  year={2021}
}

@article{zhang2020bebold,
  title={BeBold: Exploration Beyond the Boundary of Explored Regions},
  author={Zhang, Tianjun and Xu, Huazhe and Wang, Xiaolong and Wu, Yi and Keutzer, Kurt and Gonzalez, Joseph E and Tian, Yuandong},
  journal={arXiv preprint arXiv:2012.08621},
  year={2020}
}

Installation

# Install Instructions
conda create -n noveld python=3.7
conda activate noveld
git clone git@github.com:tianjunz/NovelD.git
cd NovelD
pip install -r requirements.txt

Train NovelD on MiniGrid

OMP_NUM_THREADS=1 python main.py --model bebold --env MiniGrid-ObstructedMaze-2Dlhb-v0 --total_frames 500000000 --intrinsic_reward_coef 0.05 --entropy_cost 0.0005

Acknowledgements

Our vanilla RL algorithm is based on RIDE.

License

This code is released under the CC-BY-NC 4.0 (Attribution-NonCommercial 4.0 International) license.
