Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning

This is the official repository for the paper "Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning" (CAP). It provides the commands to run the PETS and PlaNet experiments included in the paper. The repository is kept minimal for ease of experimentation.

Installation

This repository requires Python 3.6 and PyTorch 1.3 or above. Run the following command to create a conda environment (tested with CUDA 10.2):

conda env create -f environment.yml

Experiments

To run the PETS experiments on the HalfCheetah environment used in our ablation study, run:

cd cap-pets

CAP

python cap-pets/run_cap_pets.py --algo cem --env HalfCheetah-v3 --cost_lim 152 \
--cost_constrained --penalize_uncertainty --learn_kappa --seed 1

CAP with fixed kappa

python cap-pets/run_cap_pets.py --algo cem --env HalfCheetah-v3 --cost_lim 152 \
--cost_constrained --penalize_uncertainty --kappa 1.0 --seed 1

CCEM

python cap-pets/run_cap_pets.py --algo cem --env HalfCheetah-v3 --cost_lim 152 \
--cost_constrained --seed 1

CEM

python cap-pets/run_cap_pets.py --algo cem --env HalfCheetah-v3 --cost_lim 152 \
--seed 1
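The four PETS commands above differ only in their trailing flags, so a sweep over several seeds can be generated rather than typed by hand. The following is a minimal sketch, not part of the repository: the names `VARIANT_FLAGS`, `BASE`, and `build_commands` are hypothetical, and running seeds other than 1 is an assumption. The flags themselves are copied verbatim from the commands above. The script only prints the commands; it does not launch them.

```python
import shlex

# Flags that distinguish the four PETS variants, copied from the commands above.
VARIANT_FLAGS = {
    "CAP": ["--cost_constrained", "--penalize_uncertainty", "--learn_kappa"],
    "CAP with fixed kappa": ["--cost_constrained", "--penalize_uncertainty",
                             "--kappa", "1.0"],
    "CCEM": ["--cost_constrained"],
    "CEM": [],
}

# Arguments shared by every variant.
BASE = ["python", "cap-pets/run_cap_pets.py", "--algo", "cem",
        "--env", "HalfCheetah-v3", "--cost_lim", "152"]


def build_commands(seeds):
    """Return one shell-quoted command string per (variant, seed) pair."""
    commands = []
    for flags in VARIANT_FLAGS.values():
        for seed in seeds:
            argv = BASE + flags + ["--seed", str(seed)]
            # shlex.quote keeps the sketch compatible with Python 3.6.
            commands.append(" ".join(shlex.quote(arg) for arg in argv))
    return commands


if __name__ == "__main__":
    # Hypothetical seed list; the commands above only show seed 1.
    for command in build_commands(seeds=[1, 2, 3]):
        print(command)
```

The printed lines can be pasted into a shell or a job scheduler script as needed.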

The commands for the PlaNet experiment on the CarRacing environment are:

CAP

python cap-planet/run_cap_planet.py --env CarRacingSkiddingConstrained-v0 \
--cost-limit 0 --binary-cost \
--cost-constrained --penalize-uncertainty \
--learn-kappa --penalty-kappa 0.1 \
--id CarRacing-cap --seed 1

CAP with fixed kappa

python cap-planet/run_cap_planet.py --env CarRacingSkiddingConstrained-v0 \
--cost-limit 0 --binary-cost \
--cost-constrained --penalize-uncertainty \
--penalty-kappa 1.0 \
--id CarRacing-kappa1 --seed 1

CCEM

python cap-planet/run_cap_planet.py --env CarRacingSkiddingConstrained-v0 \
--cost-limit 0 --binary-cost \
--cost-constrained \
--id CarRacing-ccem --seed 1

CEM

python cap-planet/run_cap_planet.py --env CarRacingSkiddingConstrained-v0 \
--cost-limit 0 --binary-cost \
--id CarRacing-cem --seed 1
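The flags above switch on up to three ingredients: a cost constraint, an uncertainty penalty scaled by kappa, and a learned kappa. The sketch below shows one plausible reading of how these pieces could fit together; it is an illustration under stated assumptions, not the repository's implementation. It assumes the penalized cost is the predicted cost plus kappa times an uncertainty estimate, and that a learned kappa moves in proportion to the gap between the observed episode cost and the cost limit. All function names and the step size `lr` are hypothetical.

```python
def penalized_cost(predicted_cost, uncertainty, kappa):
    """Assumed form of the conservative penalty: cost inflated by kappa * uncertainty."""
    return predicted_cost + kappa * uncertainty


def is_feasible(costs, uncertainties, kappa, cost_limit):
    """Check a planned trajectory against the cost limit using penalized costs."""
    total = sum(penalized_cost(c, u, kappa) for c, u in zip(costs, uncertainties))
    return total <= cost_limit


def update_kappa(kappa, observed_cost, cost_limit, lr=0.1):
    """Assumed adaptive rule: raise kappa after a violation, lower it otherwise.

    kappa is clipped at zero so the penalty never rewards uncertainty.
    """
    return max(0.0, kappa + lr * (observed_cost - cost_limit))


if __name__ == "__main__":
    costs = [0.0, 0.0]
    uncertainties = [0.25, 0.25]
    # With kappa = 0 the plan looks safe; a positive kappa makes it infeasible.
    print(is_feasible(costs, uncertainties, kappa=0.0, cost_limit=0.0))
    print(is_feasible(costs, uncertainties, kappa=1.0, cost_limit=0.0))
    # An observed violation pushes kappa up.
    print(update_kappa(kappa=1.0, observed_cost=4.0, cost_limit=2.0, lr=0.5))
```

Under this reading, CEM uses neither the constraint nor the penalty, CCEM corresponds to kappa fixed at zero, the fixed-kappa variant skips `update_kappa`, and CAP uses all three pieces.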

Contact

If you have any questions regarding the code or paper, feel free to contact [email protected] or open an issue on this repository.

Acknowledgement

This repository contains code adapted from the following repositories: PETS and PlaNet. We thank the authors and contributors for open-sourcing their code.
