Dynamic Bottleneck for Robust Self-Supervised Exploration

Last update: Nov 14, 2022

Dynamic Bottleneck

Introduction

This is a TensorFlow-based implementation of our paper "Dynamic Bottleneck for Robust Self-Supervised Exploration" (NeurIPS 2021).

Prerequisites

Python 3.6 or 3.7, tensorflow-gpu 1.x, tensorflow-probability, OpenAI Baselines, OpenAI Gym
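Before training, it can help to confirm that the environment matches the prerequisites above. The following is a minimal sketch, not part of the repository; the importable module names (`tensorflow`, `tensorflow_probability`, `baselines`, `gym`) are assumed to correspond to the listed packages.

```python
import importlib.util
import sys

# Assumption: these import names correspond to the prerequisites listed above.
REQUIRED_MODULES = ["tensorflow", "tensorflow_probability", "baselines", "gym"]


def python_version_ok(version_info=sys.version_info):
    """Return True for Python 3.6 or 3.7, the versions named in the prerequisites."""
    return tuple(version_info[:2]) in {(3, 6), (3, 7)}


def missing_modules(names=REQUIRED_MODULES):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if not python_version_ok():
        print("Warning: expected Python 3.6 or 3.7, found", sys.version.split()[0])
    for name in missing_modules():
        print("Missing package:", name)
```

The check only reports whether a module can be located; it does not verify that the installed TensorFlow is a 1.x release.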

Installation and Usage

Atari games

The following command should train a pure exploration agent on "Breakout" with default experiment parameters.

python run.py --env BreakoutNoFrameskip-v4
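To run the same experiment on several games in sequence, the command above can be wrapped in a small launcher. This is a hedged sketch: it assumes `run.py` is in the current directory and uses only the `--env` flag shown above. The helper names are hypothetical, and any game list passed to it is the caller's choice rather than something prescribed by the repository.

```python
import subprocess
import sys


def build_command(env_id, script="run.py"):
    """Build the argument list for one training run (only --env is taken from the README)."""
    return [sys.executable, script, "--env", env_id]


def run_all(env_ids, dry_run=True):
    """Print each command and, unless dry_run is set, execute the runs one after another."""
    commands = [build_command(env_id) for env_id in env_ids]
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops the loop if a run exits with an error.
            subprocess.run(command, check=True)
    return commands
```

Calling `run_all(["BreakoutNoFrameskip-v4"])` only prints the command; pass `dry_run=False` to start training.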

Atari games with Random-Box noise

The following command should train a pure exploration agent on "Breakout" with Random-Box noise.

python run.py --env BreakoutNoFrameskip-v4 --randomBoxNoise
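For intuition, the sketch below shows one way such a perturbation could look, assuming Random-Box noise means overwriting randomly placed square patches of the frame with noise pixels. It is an illustration only, not the code behind `--randomBoxNoise`; the function name, box count, box size, and noise parameters are all hypothetical.

```python
import numpy as np


def add_random_boxes(frame, num_boxes=3, box_size=10, rng=None):
    """Return a copy of `frame` with square patches overwritten by noise pixels.

    `frame` is a uint8 array of shape (H, W) or (H, W, C). The defaults are
    illustrative values, not settings taken from the repository.
    """
    rng = np.random.default_rng() if rng is None else rng
    noisy = frame.copy()
    height, width = frame.shape[:2]
    patch_shape = (box_size, box_size) + frame.shape[2:]
    for _ in range(num_boxes):
        # Pick the top-left corner so the box stays inside the frame.
        top = rng.integers(0, height - box_size + 1)
        left = rng.integers(0, width - box_size + 1)
        # Fill the box with Gaussian noise centred on mid-grey, clipped to [0, 255].
        patch = rng.normal(loc=127.5, scale=50.0, size=patch_shape)
        noisy[top:top + box_size, left:left + box_size] = np.clip(patch, 0, 255).astype(np.uint8)
    return noisy
```

The input frame is left unchanged, so the clean and perturbed observations can be compared side by side.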

Atari games with Gaussian noise

The following command should train a pure exploration agent on "Breakout" with Gaussian noise.

python run.py --env BreakoutNoFrameskip-v4 --pixelNoise
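As a rough illustration of pixel-level Gaussian noise, the sketch below adds independent zero-mean Gaussian noise to every pixel of a frame. It assumes 8-bit observations; the function name and the standard deviation are hypothetical and not taken from the repository, so the behaviour of `--pixelNoise` may differ.

```python
import numpy as np


def add_gaussian_noise(frame, std=10.0, rng=None):
    """Return a uint8 copy of `frame` with zero-mean Gaussian noise added per pixel."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(loc=0.0, scale=std, size=frame.shape)
    # Work in float to avoid uint8 wrap-around, then clip back to the valid range.
    noisy = frame.astype(np.float32) + noise
    return np.clip(noisy, 0, 255).astype(np.uint8)
```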

Atari games with sticky actions

The following command should train a pure exploration agent on "sticky Breakout", where the previous action is repeated with a probability of 0.25.

python run.py --env BreakoutNoFrameskip-v4 --stickyAtari
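The sticky-action mechanism can be sketched independently of any environment, assuming the usual definition in which the previously executed action replaces the requested one with a fixed probability. The class below is a hypothetical illustration, not the repository's `--stickyAtari` implementation.

```python
import random


class StickyActionSelector:
    """Repeat the previously executed action with probability `stickiness`."""

    def __init__(self, stickiness=0.25, seed=None):
        self.stickiness = stickiness
        self.rng = random.Random(seed)
        self.last_action = None

    def reset(self):
        """Forget the previous action, e.g. at the start of an episode."""
        self.last_action = None

    def select(self, requested_action):
        """Return the action that is actually executed for this step."""
        if self.last_action is not None and self.rng.random() < self.stickiness:
            action = self.last_action  # the environment "sticks" to the old action
        else:
            action = requested_action
        self.last_action = action
        return action
```

In use, the agent's chosen action would be passed through `select` before each environment step, and `reset` would be called whenever a new episode begins.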

Baselines

ICM: We use the official code of "Curiosity-driven Exploration by Self-supervised Prediction, ICML 2017" and "Large-Scale Study of Curiosity-Driven Learning, ICLR 2019".
Disagreement: We use the official code of "Self-Supervised Exploration via Disagreement, ICML 2019".
CB: We use the official code of "Curiosity-Bottleneck: Exploration by Distilling Task-Specific Novelty, ICML 2019".

Owner

Bai Chenjia
