PyTorch implementation of Asynchronous Advantage Actor-Critic (A3C)

Last update: Dec 08, 2022

Overview

Asynchronous Advantage Actor-Critic (A3C) in PyTorch

@inproceedings{mnih2016asynchronous,
  title={Asynchronous methods for deep reinforcement learning},
  author={Mnih, Volodymyr and Badia, Adria Puigdomenech and Mirza, Mehdi and Graves, Alex and Lillicrap, Timothy P and Harley, Tim and Silver, David and Kavukcuoglu, Koray},
  booktitle={International Conference on Machine Learning},
  year={2016}}

This repository contains an implementation of Asynchronous Advantage Actor-Critic (A3C) in PyTorch, based on the original paper by Mnih et al. (2016) and the PyTorch implementation by Ilya Kostrikov.

At the time of its publication, A3C was a state-of-the-art deep reinforcement learning method.
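For readers new to the method, the sketch below illustrates the quantities each A3C worker computes over a short rollout: discounted n-step returns bootstrapped from the critic, advantages, and the policy and value losses with an entropy bonus. It is a minimal illustration in plain Python with scalar inputs, not the code used in this repository. The function names, the default `gamma`, and the default `entropy_coef` are assumptions chosen for the example. In a PyTorch implementation these values would be tensors, and the advantage would typically be treated as a constant in the policy term.

```python
def n_step_returns(rewards, bootstrap_value, gamma=0.99):
    """Discounted returns R_t = r_t + gamma * R_{t+1}.

    bootstrap_value is the critic's estimate V(s_T) for the state after the
    last step of the rollout (0.0 if the episode terminated).
    """
    returns = []
    running = bootstrap_value
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(rewards, values, bootstrap_value, gamma=0.99):
    """Advantage A_t = R_t - V(s_t) for each step of the rollout."""
    returns = n_step_returns(rewards, bootstrap_value, gamma)
    return [ret - value for ret, value in zip(returns, values)]


def a3c_losses(log_probs, entropies, rewards, values, bootstrap_value,
               gamma=0.99, entropy_coef=0.01):
    """Return (policy_loss, value_loss) for one rollout.

    log_probs[t] is log pi(a_t | s_t) and entropies[t] is the policy entropy
    at step t. entropy_coef is a hypothetical default for illustration.
    """
    adv = advantages(rewards, values, bootstrap_value, gamma)
    # Policy gradient term weighted by the advantage, minus an entropy bonus
    # that discourages premature convergence to a deterministic policy.
    policy_loss = -sum(lp * a for lp, a in zip(log_probs, adv))
    policy_loss -= entropy_coef * sum(entropies)
    # Squared error between the n-step return and the critic's estimate.
    value_loss = 0.5 * sum(a * a for a in adv)
    return policy_loss, value_loss
```

Each asynchronous worker would compute losses like these on its own rollout and apply the resulting gradients to the shared model.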

Dependencies

Python 2.7
PyTorch
gym (OpenAI)
universe (OpenAI)
opencv (for env state processing)
visdom (for visualization)

Training

./train_lstm.sh

Test with the trained weights after 169000 updates for PongDeterministic-v3.

./test_lstm.sh 169000

A test result video is available.

Check the loss curves of all threads in visdom at http://localhost:8097

Owner

LEI TAI
