Conjugated Discrete Distributions for Distributional Reinforcement Learning (C2D)

Last update: Jan 11, 2022

Code & Data Appendix for Conjugated Discrete Distributions for Distributional Reinforcement Learning.

Björn Lindenberg, Jonas Nordqvist, Karl-Olof Lindahl

Citation

If you use C2D in your research, please cite the following:

```
@misc{lindenberg2021conjugated,
      title={Conjugated Discrete Distributions for Distributional Reinforcement Learning},
      author={Björn Lindenberg and Jonas Nordqvist and Karl-Olof Lindahl},
      year={2021},
      eprint={2112.07424},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
```

Data

Agent scores are available in the data folder.
Raw experiment data for each seed is available in the folder data/supplementary.
Each seed was run on an Ubuntu 20.04 VM server with 64 GB RAM, a single Nvidia Quadro P4000 GPU, and TensorFlow 2.5.

Code

The C++20 source code that handles ALE and transition buffering is in src.
The agent code, including the algorithms, is written in Python with TensorFlow and is in c2d.
Requirements: cuDNN, TensorFlow 2.x, Python 3, the Arcade Learning Environment (ALE), a C++20 compiler, and LZ4. For a comprehensive list of dependencies, see our VM setup files in install_scripts.

Atari Games

To avoid legal issues, our Atari 2600 ROM directory ale_roms is left empty. However, the corresponding binaries are widely available elsewhere; for example, the Breakout ROM (breakout.bin) can be extracted from the atari-py Python package.

Library

The directory ale_roms must be populated with the binaries of the Atari games you want to run. ALE's checksum file md5.txt, used to check binary compatibility, is in the root directory.
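As a rough illustration of the compatibility check, the sketch below hashes every `.bin` file in a ROM directory and looks the digest up in a checksum file. It is not part of the repository: the function names `load_checksums` and `check_roms` are hypothetical, and it assumes md5.txt contains lines of the form `<md5-hex> <filename>`, skipping anything else.

```python
import hashlib
from pathlib import Path


def load_checksums(md5_file):
    """Parse lines assumed to look like '<md5-hex> <filename>' into {md5: filename}."""
    known = {}
    for line in Path(md5_file).read_text().splitlines():
        parts = line.split(maxsplit=1)
        # Skip blank lines and lines whose first field is not a 32-char hex digest.
        if len(parts) != 2 or len(parts[0]) != 32:
            continue
        try:
            int(parts[0], 16)
        except ValueError:
            continue
        known[parts[0].lower()] = parts[1].strip()
    return known


def check_roms(rom_dir, md5_file):
    """Return {rom filename: matching checksum entry, or None if unrecognized}."""
    known = load_checksums(md5_file)
    report = {}
    for rom in sorted(Path(rom_dir).glob("*.bin")):
        digest = hashlib.md5(rom.read_bytes()).hexdigest()
        report[rom.name] = known.get(digest)
    return report
```

Calling `check_roms("ale_roms", "md5.txt")` from the repository root would then map each ROM file to the entry it matches, with `None` marking a binary whose checksum is not listed.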
The initial library setup, and any later change to settings.cmake, requires compiling the library with:
```
bash build_lib.sh
```
One can train for one iteration (1M frames) in Breakout with:
```
python3 run.py --game breakout --tag test --iterations 1
```
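To queue the same run for several games, the command above can be wrapped in a small launcher. The following is only a sketch: it uses just the three flags shown above (`--game`, `--tag`, `--iterations`), the helper names `build_command` and `run_games` are hypothetical, and any game name other than `breakout` is an assumption about what run.py accepts.

```python
import subprocess
import sys


def build_command(game, tag, iterations):
    """Assemble a run.py invocation using only the flags shown in the README."""
    return [sys.executable, "run.py", "--game", game, "--tag", tag,
            "--iterations", str(iterations)]


def run_games(games, tag, iterations, dry_run=True):
    """Return one training command per game; execute them unless dry_run is set."""
    commands = [build_command(g, tag, iterations) for g in games]
    if not dry_run:
        for cmd in commands:
            # check=True stops the batch at the first failing run.
            subprocess.run(cmd, check=True)
    return commands
```

With the default `dry_run=True` the function only returns the commands, which makes it easy to inspect them before committing GPU time; pass `dry_run=False` from the repository root to actually start training.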

Figures

The repository provides the following figures:
Performance Profile (Deep reinforcement learning at the edge of the statistical precipice, Agarwal et al. 2021)
Sampling Efficiency: Mean and Median
Training Graphs
Strong/Weak Examples
Support Evolution
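For readers unfamiliar with the performance profile of Agarwal et al. (2021): for each threshold τ it reports the fraction of runs, pooled over games and seeds, whose normalized score exceeds τ. The sketch below computes that quantity in plain Python. The function name and the toy scores are hypothetical, and it makes no assumption about the file format of the data folder.

```python
def performance_profile(scores, thresholds):
    """Fraction of runs with normalized score strictly above each threshold.

    scores: nested list, one inner list of normalized scores per seed
            (one entry per game).
    """
    # Pool every (seed, game) score into one flat list of runs.
    flat = [s for run in scores for s in run]
    return [sum(s > tau for s in flat) / len(flat) for tau in thresholds]


# Toy example: 2 seeds x 2 games of made-up normalized scores.
toy_scores = [[0.5, 1.5], [2.0, 0.1]]
profile = performance_profile(toy_scores, thresholds=[0.0, 1.0, 2.0])
print(profile)  # [1.0, 0.5, 0.0]
```

A curve that stays higher for longer as τ grows indicates an agent that reaches high normalized scores on a larger share of runs.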
