TensorFlow implementation of Deep Reinforcement Learning papers

Last update: Jan 03, 2023

Overview

Deep Reinforcement Learning in TensorFlow

TensorFlow implementation of Deep Reinforcement Learning papers. This implementation contains:

[1] Playing Atari with Deep Reinforcement Learning
[2] Human-Level Control through Deep Reinforcement Learning
[3] Deep Reinforcement Learning with Double Q-learning
[4] Dueling Network Architectures for Deep Reinforcement Learning
[5] Prioritized Experience Replay (in progress)
[6] Deep Exploration via Bootstrapped DQN (in progress)
[7] Asynchronous Methods for Deep Reinforcement Learning (in progress)
[8] Continuous Deep Q-Learning with Model-based Acceleration (in progress)

Requirements

Usage

First, install prerequisites with:

$ pip install -U 'gym[all]' tqdm scipy

Don't forget to also install the latest TensorFlow. Also note that you need to install the dependencies of doom-py, which is required by gym[all].

Train with the DQN model described in [1] without a GPU:

$ python main.py --network_header_type=nips --env_name=Breakout-v0 --use_gpu=False

Train with DQN model described in [2]:

$ python main.py --network_header_type=nature --env_name=Breakout-v0

Train with Double DQN model described in [3]:

$ python main.py --double_q=True --env_name=Breakout-v0
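As a rough illustration of what Double Q-learning [3] changes, the sketch below compares the standard DQN target with the Double DQN target for a single transition. It is written in plain Python and is not taken from this repository; the function names, the `gamma` default, and the list-based Q-values are all assumptions made for the example.

```python
def dqn_target(reward, next_q_target, gamma=0.99, terminal=False):
    """Standard DQN target: the target network both picks and scores the action."""
    if terminal:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, terminal=False):
    """Double DQN target: the online network picks the action,
    the target network scores it."""
    if terminal:
        return reward
    # Index of the greedy action according to the online network.
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Hypothetical Q-values for the next state (three actions).
next_q_online = [1.0, 2.0, 0.5]
next_q_target = [3.0, 1.0, 4.0]

print(dqn_target(1.0, next_q_target, gamma=0.5))                        # 3.0
print(double_dqn_target(1.0, next_q_online, next_q_target, gamma=0.5))  # 1.5
```

In this toy case the plain DQN target takes the largest target-network value, while the Double DQN target uses the target network's value for the action the online network prefers, which is the decoupling that [3] proposes to reduce overestimation.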

Train with the Dueling network with Double Q-learning described in [4]:

$ python main.py --double_q=True --network_output_type=dueling --env_name=Breakout-v0
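For readers unfamiliar with the dueling output head, the sketch below shows the mean-subtracted aggregation described in [4], which combines a scalar state value with per-action advantages. It is a plain-Python illustration under assumed names (`dueling_q_values`, `state_value`, `advantages`); whether this repository's `dueling` output type uses exactly this aggregation is not confirmed here.

```python
def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a) using mean-subtracted advantages:
    Q(s, a) = V(s) + (A(s, a) - mean over a' of A(s, a'))."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + (a - mean_advantage) for a in advantages]


# Hypothetical outputs of the value stream and the advantage stream.
q_values = dueling_q_values(2.0, [1.0, 3.0, 5.0])
print(q_values)  # [0.0, 2.0, 4.0]
```

Subtracting the mean advantage makes the average of the resulting Q-values equal to the state value, which keeps the two streams identifiable.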

Train with the MLP model described in [4] on the corridor environment (useful for debugging):

$ python main.py --network_header_type=mlp --network_output_type=normal --observation_dims='[16]' --env_name=CorridorSmall-v5 --t_learn_start=0.1 --learning_rate_decay_step=0.1 --history_length=1 --n_action_repeat=1 --t_ep_end=10 --display=True --learning_rate=0.025 --learning_rate_minimum=0.0025
$ python main.py --network_header_type=mlp --network_output_type=normal --double_q=True --observation_dims='[16]' --env_name=CorridorSmall-v5 --t_learn_start=0.1 --learning_rate_decay_step=0.1 --history_length=1 --n_action_repeat=1 --t_ep_end=10 --display=True --learning_rate=0.025 --learning_rate_minimum=0.0025
$ python main.py --network_header_type=mlp --network_output_type=dueling --observation_dims='[16]' --env_name=CorridorSmall-v5 --t_learn_start=0.1 --learning_rate_decay_step=0.1 --history_length=1 --n_action_repeat=1 --t_ep_end=10 --display=True --learning_rate=0.025 --learning_rate_minimum=0.0025
$ python main.py --network_header_type=mlp --network_output_type=dueling --double_q=True --observation_dims='[16]' --env_name=CorridorSmall-v5 --t_learn_start=0.1 --learning_rate_decay_step=0.1 --history_length=1 --n_action_repeat=1 --t_ep_end=10 --display=True --learning_rate=0.025 --learning_rate_minimum=0.0025
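The commands above pass `--n_action_repeat=1`, which presumably disables action repeat (frame-skip) for the corridor task. The sketch below shows the general idea of action repeat: the same action is applied for several consecutive steps and the rewards are summed. The `CountingEnv` class and the `repeat_action` helper are hypothetical and exist only for this example; they are not part of this repository or of gym.

```python
class CountingEnv:
    """Hypothetical toy environment: reward 1.0 per step, ends after `limit` steps."""

    def __init__(self, limit=10):
        self.limit = limit
        self.steps = 0

    def step(self, action):
        self.steps += 1
        done = self.steps >= self.limit
        return self.steps, 1.0, done  # (observation, reward, done)


def repeat_action(env, action, n_repeat):
    """Apply `action` up to `n_repeat` times, summing rewards and
    stopping early if the episode ends."""
    total_reward = 0.0
    observation, done = None, False
    for _ in range(n_repeat):
        observation, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return observation, total_reward, done


env = CountingEnv(limit=10)
print(repeat_action(env, 0, 4))  # (4, 4.0, False)
print(repeat_action(env, 0, 4))  # (8, 4.0, False)
print(repeat_action(env, 0, 4))  # (10, 2.0, True)
```

With `n_repeat=1` the helper reduces to a single environment step, which matches the intuition that a small debugging environment does not need frame-skip.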

Results

Result of Corridor-v5 in [4] for DQN (purple), DDQN (red), Dueling DQN (green), Dueling DDQN (blue).

Result of Breakout-v0 for DQN without frame-skip (white-blue), DQN with frame-skip (light purple), Dueling DDQN (dark blue).

The hyperparameters and gradient clipping are not implemented exactly as described in [4].

References

Author

Taehoon Kim / @carpedm20
