Code for the paper "Functional Regularization for Reinforcement Learning via Learned Fourier Features"

Last update: Nov 11, 2022

Related tags

Overview

Reinforcement Learning with Learned Fourier Features

State-space Soft Actor-Critic Experiments

Move to the state-SAC-LFF repository.

cd state-SAC-LFF

To install the dependencies, use the provided environment.yml file

conda env create -f environment.yml

To run an experiment, the template for MLP and LFF experiments, respectively, are:

python main.py --policy PytorchSAC --env dm.quadruped.run --start_timesteps 5000 --hidden_dim 1024 --batch_size 1024 --n_hidden 3
python main.py --policy PytorchSAC --env dm.quadruped.run --start_timesteps 5000 --hidden_dim 1024 --batch_size 1024 --n_hidden 2 \
               --network_class FourierMLP --sigma 0.001 --fourier_dim 1024 --train_B --concatenate_fourier

The only thing that changes between the baseline is the number of hidden layers (we reduce by 1 to keep parameter count roughly the same), the network_class, the fourier_dim, sigma, train_B, and concatenate_fourier.

Image-space Soft Actor-Critic Experiments

Move to the image-SAC-LFF repository.

cd image-SAC-LFF

Install RAD dependencies:

conda env create -f conda_env.yml

To run an experiment, the template for CNN and CNN+LFF experiments, respectively, are:

python train.py --domain_name hopper --task_name hop --encoder_type fourier_pixel --action_repeat 4 \
                --num_eval_episodes 10 \--pre_transform_image_size 100 --image_size 84 --agent rad_sac \
                --frame_stack 3 --data_augs crop --critic_lr 1e-3 --actor_lr 1e-3 --eval_freq 10000 --batch_size 128 \
                --num_train_steps 1000000 --fourier_dim 128 --sigma 0.1 --train_B --concatenate_fourier
python train.py --domain_name hopper --task_name hop --encoder_type fair_pixel --action_repeat 4 \
                --num_eval_episodes 10 \--pre_transform_image_size 100 --image_size 84 --agent rad_sac \
                --frame_stack 3 --data_augs crop --critic_lr 1e-3 --actor_lr 1e-3 --eval_freq 10000 --batch_size 128 \
                --num_train_steps 1000000

Proximal Policy Optimization Experiments

Move to the state-PPO-LFF repository.

cd pytorch-a2c-ppo-acktr-gail

Install PPO dependencies:

conda env create -f environment.yml

To run an experiment, the template for MLP and LFF experiments, respectively, are:

python main.py --env-name Hopper-v2 --algo ppo --use-gae --log-interval 1 --num-steps 2048 --num-processes 1 \
               --lr 3e-4 --entropy-coef 0 --value-loss-coef 0.5 --ppo-epoch 10 --num-mini-batch 32 --gamma 0.99 \
               --gae-lambda 0.95 --num-env-steps 1000000 --use-linear-lr-decay --use-proper-time-limits \
               --hidden_dim 256 --network_class MLP --n_hidden 2 --seed 10
python main.py --env-name Hopper-v2 --algo ppo --use-gae --log-interval 1 --num-steps 2048 --num-processes 1 \
               --lr 3e-4 --entropy-coef 0 --value-loss-coef 0.5 --ppo-epoch 10 --num-mini-batch 32 --gamma 0.99 \
               --gae-lambda 0.95 --num-env-steps 1000000 --use-linear-lr-decay --use-proper-time-limits \
               --hidden_dim 256 --network_class FourierMLP --n_hidden 2 --sigma 0.01 --fourier_dim 64 \ 
               --concatenate_fourier --train_B --seed 10

Acknowledgements

We built the state-based SAC codebase off the TD3 repo by Fujimoto et al. We especially appreciated its lightweight bare-bones training loop. For the state-based SAC algorithm implementation and hyperparameters, we used this PyTorch SAC repo by Yarats and Kostrikov. For the SAC+RAD image-based experiments, we used the authors' implementation. Finally, we built off this PPO codebase by Ilya Kostrikov.

Code for the paper "Functional Regularization for Reinforcement Learning via Learned Fourier Features"

Related tags

Overview

Reinforcement Learning with Learned Fourier Features

State-space Soft Actor-Critic Experiments

Image-space Soft Actor-Critic Experiments

Proximal Policy Optimization Experiments

Acknowledgements

Owner

Alex Li

Official PyTorch Implementation of "AgentFormer: Agent-Aware Transformers for Socio-Temporal Multi-Agent Forecasting".

HNECV: Heterogeneous Network Embedding via Cloud model and Variational inference

Dataset used in "PlantDoc: A Dataset for Visual Plant Disease Detection" accepted in CODS-COMAD 2020

Vision transformers (ViTs) have found only limited practical use in processing images

Lightweight tool to perform MITM attack on local network

Load What You Need: Smaller Multilingual Transformers for Pytorch and TensorFlow 2.0.

A MatConvNet-based implementation of the Fully-Convolutional Networks for image segmentation

Codes to calculate solar-sensor zenith and azimuth angles directly from hyperspectral images collected by UAV. Works only for UAVs that have high resolution GNSS/IMU unit.

QRec: A Python Framework for quick implementation of recommender systems (TensorFlow Based)

A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

An official TensorFlow implementation of “CLCC: Contrastive Learning for Color Constancy” accepted at CVPR 2021.

Tooling for converting STAC metadata to ODC data model

A Pytorch Implementation for Compact Bilinear Pooling.

Multi-Stage Progressive Image Restoration

Fbone (Flask bone) is a Flask (Python microframework) starter/template/bootstrap/boilerplate application.

Fastquant - Backtest and optimize your trading strategies with only 3 lines of code!

Adversarial Graph Augmentation to Improve Graph Contrastive Learning

From the basics to slightly more interesting applications of Tensorflow

[NeurIPS 2021] Low-Rank Subspaces in GANs

MoveNetを用いたPythonでの姿勢推定のデモ