TensorFlow implementation of Human-Level Control through Deep Reinforcement Learning

Last update: Dec 26, 2022

Overview

Human-Level Control through Deep Reinforcement Learning

TensorFlow implementation of Human-Level Control through Deep Reinforcement Learning.

This implementation contains:

Deep Q-network and Q-learning
Experience replay memory
- to reduce the correlations between consecutive updates
Target network for Q-learning, held fixed between periodic updates
- to reduce the correlations between target and predicted Q-values
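
The two ideas in the list above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code from this repository: the names `ReplayMemory`, `sync_target`, and `q_learning_targets` are hypothetical, network parameters are stood in for by a dictionary, and `target_q` is an assumed callable that returns a list of Q-values for a state.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer of (state, action, reward, next_state, terminal) tuples."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, terminal):
        self.buffer.append((state, action, reward, next_state, terminal))

    def sample(self, batch_size):
        # Uniform sampling mixes old and new transitions, so a training batch
        # is not made of consecutive, highly correlated steps.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def sync_target(step, online_params, target_params, sync_every):
    """Return the parameters the target network should use at this step."""
    if step % sync_every == 0:
        # Copy, so later updates to the online parameters do not leak in.
        return dict(online_params)
    return target_params


def q_learning_targets(batch, target_q, gamma=0.99):
    """Compute r + gamma * max_a' Q_target(s', a') for each transition."""
    targets = []
    for _state, _action, reward, next_state, terminal in batch:
        if terminal:
            targets.append(reward)  # no bootstrapping past the end of an episode
        else:
            targets.append(reward + gamma * max(target_q(next_state)))
    return targets
```

Because `target_q` only changes when `sync_target` copies the online parameters, the regression targets stay fixed between syncs while the online network is being updated.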

Requirements

Python 2.7 or Python 3.3+
gym
tqdm
SciPy or OpenCV2
TensorFlow 0.12.0

Usage

First, install prerequisites with:

$ pip install tqdm gym[all]

To train a model for Breakout:

$ python main.py --env_name=Breakout-v0 --is_train=True
$ python main.py --env_name=Breakout-v0 --is_train=True --display=True

To test and record the screen with gym:

$ python main.py --is_train=False
$ python main.py --is_train=False --display=True

Results

Result of training for 24 hours on a GTX 980 Ti.

Simple Results

Details of Breakout with model m2 (red) for 30 hours on a GTX 980 Ti.

Details of Breakout with model m3 (red) for 30 hours on a GTX 980 Ti.

Detailed Results

[1] Action-repeat (frame-skip) of 1, 2, and 4 without learning rate decay

[2] Action-repeat (frame-skip) of 1, 2, and 4 with learning rate decay

[1] & [2]

[3] Action-repeat of 4 for DQN (dark blue), Dueling DQN (dark green), DDQN (brown), and Dueling DDQN (turquoise)

The current hyperparameters and gradient clipping do not follow the paper exactly.
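
For readers unfamiliar with the variants compared in [3], the sketch below shows how their update targets and Q-value heads usually differ in the literature. It is an illustration under assumptions, not this repository's code: the function names are hypothetical, and Q-values are passed in as plain Python lists rather than tensors.

```python
def argmax(values):
    """Index of the largest element in a list."""
    return max(range(len(values)), key=lambda i: values[i])


def dqn_target(reward, next_q_target, gamma=0.99, terminal=False):
    """DQN: the target network both selects and evaluates the next action."""
    if terminal:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target,
                      gamma=0.99, terminal=False):
    """Double DQN: the online network selects, the target network evaluates."""
    if terminal:
        return reward
    best_action = argmax(next_q_online)
    return reward + gamma * next_q_target[best_action]


def dueling_q_values(state_value, advantages):
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean over actions of A(s, .)."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]
```

A Dueling DDQN combines the last two: Q-values come from the dueling head, and targets are computed with `double_dqn_target`.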

[4] Distributed action-repeat (frame-skip) of 1 without learning rate decay

[5] Distributed action-repeat (frame-skip) of 4 without learning rate decay
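
Action-repeat (frame-skip), which the plots above vary between 1, 2, and 4, means applying each chosen action for several consecutive environment steps and summing the rewards. The sketch below illustrates the idea with a toy environment. `CountingEnv` and `repeat_action` are hypothetical names, and the three-value `step` return is a simplification that does not match gym's actual API.

```python
class CountingEnv:
    """Toy stand-in environment: reward 1.0 per step, ends after `length` steps."""

    def __init__(self, length):
        self.length = length
        self.t = 0

    def step(self, action):
        self.t += 1
        observation = self.t
        done = self.t >= self.length
        return observation, 1.0, done


def repeat_action(env, action, repeat):
    """Apply `action` up to `repeat` times; return last observation and summed reward."""
    total_reward = 0.0
    observation, done = None, False
    for _ in range(repeat):
        observation, reward, done = env.step(action)
        total_reward += reward
        if done:
            break  # stop early when the episode ends mid-repeat
    return observation, total_reward, done


env = CountingEnv(length=6)
first = repeat_action(env, action=0, repeat=4)
second = repeat_action(env, action=0, repeat=4)
```

With a repeat of 4 the agent makes a decision only on every fourth frame, so the same number of decisions covers four times as many game frames.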

License

MIT License.

Owner

Devsisters Corp.
