RL algorithm PPO and IRL algorithm AIRL written with Tensorflow.

Last update: Dec 28, 2021

Related tags

Deep Learning PPO-and-AIRL-with-parallel-sampling

Overview

Key packages verison

numpy==1.16
tensorflow==1.14
gym==0.15.4
ray==1.2

What can this repository do

Reinforcement learning algorithm PPO, with parallel sampling, continous/discrete action space
Inverse reinforcement learning algorithm AIRL, with parallel sampling, continous/discrete action space
Expert trajectory generator
parallel sampling feature can greatly speed up the overall training process especially with HPC

Run the codes

PPO: python run_ppo_combo_gym.py
Generate expert trajectory: python sample_expert_data.py
AIRL: python run_AIRL_combo_gym.py

Tune the hyperparameter

The hyperparameters can be changed in argparser() or command line, e.g., python run_ppo_combo_gym.py --clip_value 0.1
The hyperparameters args.num_parallel_sampler setups the number of parallel samplers to be deployed
The hyperparameters args.sample_size setups the total number of samples per iteration

Some results

The PPO and AIRL have been tested with openai-gym environments, e.g., CartPole-v1, Pendulum-v0, and BipedalWalker-v2
Some training results and models are saved in the directories
The training result with BipedalWalker-v2 is shown here as an example.

PPO: AIRL:

Owner

Fangjian Li

Fangjian Li

GitHub Repository

Adversarial Attacks are Reversible via Natural Supervision

Adversarial Attacks are Reversible via Natural Supervision ICCV2021 Citation @InProceedings{Mao_2021_ICCV, author = {Mao, Chengzhi and Chiquier

20 May 22, 2022

Train CPPNs as a Generative Model, using Generative Adversarial Networks and Variational Autoencoder techniques to produce high resolution images.

cppn-gan-vae tensorflow Train Compositional Pattern Producing Network as a Generative Model, using Generative Adversarial Networks and Variational Aut

343 Dec 29, 2022

Code to accompany the paper "Finding Bipartite Components in Hypergraphs", which is published in NeurIPS'21.

Finding Bipartite Components in Hypergraphs This repository contains code to accompany the paper "Finding Bipartite Components in Hypergraphs", publis

5 May 06, 2022

Implementation of Segnet, FCN, UNet , PSPNet and other models in Keras.

Image Segmentation Keras : Implementation of Segnet, FCN, UNet, PSPNet and other models in Keras. Implementation of various Deep Image Segmentation mo

2.6k Jan 05, 2023

Red Team tool for exfiltrating files from a target's Google Drive that you have access to, via Google's API.

GD-Thief Red Team tool for exfiltrating files from a target's Google Drive that you(the attacker) has access to, via the Google Drive API. This includ

39 Dec 27, 2022

Official repository of OFA. Paper: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Paper | Blog OFA is a unified multimodal pretrained model that unifies modalities (i.e., cross-modality, vision, language) and tasks (e.g., image gene

1.4k Jan 08, 2023

PyTorch implementation of the Value Iteration Networks (VIN) (NIPS '16 best paper)

Value Iteration Networks in PyTorch Tamar, A., Wu, Y., Thomas, G., Levine, S., and Abbeel, P. Value Iteration Networks. Neural Information Processing

75 Nov 24, 2022

Customizable RecSys Simulator for OpenAI Gym

gym-recsys: Customizable RecSys Simulator for OpenAI Gym Installation | How to use | Examples | Citation This package describes an OpenAI Gym interfac

14 Dec 08, 2022

official code for dynamic convolution decomposition

Revisiting Dynamic Convolution via Matrix Decomposition (ICLR 2021) A pytorch implementation of DCD. If you use this code in your research please cons

110 Nov 23, 2022

Dahua Camera and Doorbell Home Assistant Integration

Home Assistant Dahua Integration The Dahua Home Assistant integration allows you to integrate your Dahua cameras and doorbells in Home Assistant. It's

216 Dec 26, 2022

Preparation material for Dropbox interviews

Dropbox-Onsite-Interviews A guide for the Dropbox onsite interview! The Dropbox interview question bank is very small. The bank has been in a Chinese

386 Dec 31, 2022

The Instructed Glacier Model (IGM)

The Instructed Glacier Model (IGM) Overview The Instructed Glacier Model (IGM) simulates the ice dynamics, surface mass balance, and its coupling thro

27 Dec 16, 2022

The tl;dr on a few notable transformer/language model papers + other papers (alignment, memorization, etc).

The tl;dr on a few notable transformer/language model papers + other papers (alignment, memorization, etc).

166 Jan 04, 2023

🇰🇷 Text to Image in Korean

KoDALLE Utilizing pretrained language model’s token embedding layer and position embedding layer as DALLE’s text encoder. Background Training DALLE mo

74 Sep 22, 2022

The implemetation of Dynamic Nerual Garments proposed in Siggraph Asia 2021

DynamicNeuralGarments Introduction This repository contains the implemetation of Dynamic Nerual Garments proposed in Siggraph Asia 2021. ./GarmentMoti

42 Dec 27, 2022

给yolov5加个gui界面，使用pyqt5，yolov5是5.0版本

博文地址 https://xugaoxiang.com/2021/06/30/yolov5-pyqt5 代码执行项目中使用YOLOv5的v5.0版本，界面文件是project.ui pip install -r requirements.txt python main.py 图片检测视频检测

215 Dec 30, 2022

Pytorch implementation of 'Fingerprint Presentation Attack Detector Using Global-Local Model'

RTK-PAD This is an official pytorch implementation of 'Fingerprint Presentation Attack Detector Using Global-Local Model', which is accepted by IEEE T

6 Aug 01, 2022

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

4.7k Jan 01, 2023

API for RL algorithm design & testing of BCA (Building Control Agent) HVAC on EnergyPlus building energy simulator by wrapping their EMS Python API

RL - EmsPy (work In Progress...) The EmsPy Python package was made to facilitate Reinforcement Learning (RL) algorithm research for developing and tes

20 Jan 05, 2023

使用OpenCV部署全景驾驶感知网络YOLOP，可同时处理交通目标检测、可驾驶区域分割、车道线检测，三项视觉感知任务，包含C++和Python两种版本的程序实现。本套程序只依赖opencv库就可以运行，从而彻底摆脱对任何深度学习框架的依赖。

YOLOP-opencv-dnn 使用OpenCV部署全景驾驶感知网络YOLOP，可同时处理交通目标检测、可驾驶区域分割、车道线检测，三项视觉感知任务，依然是包含C++和Python两种版本的程序实现 onnx文件从百度云盘下载，链接：https://pan.baidu.com/s/1A_9cldU

178 Jan 07, 2023