Conjugated Discrete Distributions for Distributional Reinforcement Learning (C2D)

Code & Data Appendix for Conjugated Discrete Distributions for Distributional Reinforcement Learning.

Björn Lindenberg, Jonas Nordqvist, Karl-Olof Lindahl

Citation

If you use C2D in your research, please cite the following:

@misc{lindenberg2021conjugated,
      title={Conjugated Discrete Distributions for Distributional Reinforcement Learning}, 
      author={Björn Lindenberg and Jonas Nordqvist and Karl-Olof Lindahl},
      year={2021},
      eprint={2112.07424},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Data

  • Agent scores are available in the data folder.
  • Raw experiment data for each seed is available in the folder data/supplementary.
  • Each seed was run on an Ubuntu 20.04 virtual machine with 64 GB RAM, a single Nvidia Quadro P4000 GPU, and TensorFlow 2.5.

Code

  • The C++20 source code that handles ALE and transition buffering resides in src.
  • The agent code, including the algorithms, is written in Python with TensorFlow and can be found in c2d.
  • Requirements: cuDNN, TensorFlow 2.x, Python 3, the Arcade Learning Environment (ALE), a C++20 compiler, and LZ4. For a comprehensive view of the dependencies, see our VM setup files in install_scripts.

Atari Games

  • To avoid legal issues, our Atari 2600 ROM directory ale_roms is left empty. The corresponding binaries are widely available elsewhere; for example, Breakout (breakout.bin) can be extracted from the atari-py Python package.
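
As a rough illustration of the extraction step, the sketch below copies a ROM binary into ale_roms. The helper names copy_rom and copy_from_atari_py are hypothetical and not part of this repository. The second helper assumes an atari-py release that still bundles the ROM files and relies on its get_game_path function; if your installed version does not ship ROMs, use copy_rom with a path to a binary obtained elsewhere.

```python
import shutil
from pathlib import Path


def copy_rom(src, rom_dir="ale_roms"):
    """Copy one ROM binary into rom_dir and return the destination path."""
    src = Path(src)
    dest_dir = Path(rom_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)  # create ale_roms if missing
    dest = dest_dir / src.name
    shutil.copyfile(src, dest)
    return dest


def copy_from_atari_py(game, rom_dir="ale_roms"):
    """Locate a ROM bundled with atari-py and copy it into rom_dir.

    Assumption: the installed atari-py version bundles the ROM binaries.
    """
    import atari_py  # imported lazily so copy_rom works without atari-py

    return copy_rom(atari_py.get_game_path(game), rom_dir)
```

For example, copy_from_atari_py("breakout") would place breakout.bin in ale_roms when run from the repository root.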

Library

  • The directory ale_roms must be populated with the binaries of the Atari games you want to run. ALE's checksum file md5.txt, used for checking binary compatibility, is in the root directory.

  • The initial library setup, or any change to settings.cmake, requires compilation with:

    bash build_lib.sh
    
  • One can train for one iteration (1M frames) in Breakout with:

    python3 run.py --game breakout --tag test --iterations 1
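
Before building or training, it can help to confirm that the binaries placed in ale_roms match the checksums in md5.txt. The following is a minimal sketch of such a check, not a script shipped with this repository. It assumes that each relevant line of md5.txt has the form "<32-character hex digest> <file name>" and skips any line that does not match; the function names load_checksums and check_roms are hypothetical.

```python
import hashlib
import re
from pathlib import Path

# Assumed line format: "<md5 hex digest> <file name>"
MD5_LINE = re.compile(r"^([0-9a-fA-F]{32})\s+(\S+)$")


def load_checksums(md5_file):
    """Parse the checksum file into {file name: lowercase md5 digest}."""
    checksums = {}
    for line in Path(md5_file).read_text().splitlines():
        match = MD5_LINE.match(line.strip())
        if match:  # headers, blank lines, and other text are skipped
            digest, name = match.groups()
            checksums[name] = digest.lower()
    return checksums


def check_roms(rom_dir, md5_file):
    """Return {file name: "ok" | "mismatch" | "unknown"} for each .bin file."""
    expected = load_checksums(md5_file)
    report = {}
    for rom in sorted(Path(rom_dir).glob("*.bin")):
        actual = hashlib.md5(rom.read_bytes()).hexdigest()
        if rom.name not in expected:
            report[rom.name] = "unknown"  # no entry for this file name
        elif expected[rom.name] == actual:
            report[rom.name] = "ok"
        else:
            report[rom.name] = "mismatch"
    return report
```

Calling check_roms("ale_roms", "md5.txt") from the repository root returns a status per binary; a "mismatch" or "unknown" entry suggests the file may not be the ROM version that ALE expects.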
    

Figures

Performance Profile (Deep reinforcement learning at the edge of the statistical precipice, Agarwal et al. 2021)

Performance Profile Aggregate Metrics

Sampling Efficiency: Mean and Median

Training Graphs

All Games

Strong/Weak Examples

Support Evolution
