RL algorithm PPO and IRL algorithm AIRL written with Tensorflow.

Overview

Key packages verison

  • numpy==1.16
  • tensorflow==1.14
  • gym==0.15.4
  • ray==1.2

What can this repository do

  • Reinforcement learning algorithm PPO, with parallel sampling, continous/discrete action space
  • Inverse reinforcement learning algorithm AIRL, with parallel sampling, continous/discrete action space
  • Expert trajectory generator
  • parallel sampling feature can greatly speed up the overall training process especially with HPC

Run the codes

  • PPO: python run_ppo_combo_gym.py
  • Generate expert trajectory: python sample_expert_data.py
  • AIRL: python run_AIRL_combo_gym.py

Tune the hyperparameter

  • The hyperparameters can be changed in argparser() or command line, e.g., python run_ppo_combo_gym.py --clip_value 0.1
  • The hyperparameters args.num_parallel_sampler setups the number of parallel samplers to be deployed
  • The hyperparameters args.sample_size setups the total number of samples per iteration

Some results

  • The PPO and AIRL have been tested with openai-gym environments, e.g., CartPole-v1, Pendulum-v0, and BipedalWalker-v2
  • Some training results and models are saved in the directories
  • The training result with BipedalWalker-v2 is shown here as an example.


PPO: AIRL:

Owner
Fangjian Li
Fangjian Li
Adversarial Attacks are Reversible via Natural Supervision

Adversarial Attacks are Reversible via Natural Supervision ICCV2021 Citation @InProceedings{Mao_2021_ICCV, author = {Mao, Chengzhi and Chiquier

Computer Vision Lab at Columbia University 20 May 22, 2022
Train CPPNs as a Generative Model, using Generative Adversarial Networks and Variational Autoencoder techniques to produce high resolution images.

cppn-gan-vae tensorflow Train Compositional Pattern Producing Network as a Generative Model, using Generative Adversarial Networks and Variational Aut

hardmaru 343 Dec 29, 2022
Code to accompany the paper "Finding Bipartite Components in Hypergraphs", which is published in NeurIPS'21.

Finding Bipartite Components in Hypergraphs This repository contains code to accompany the paper "Finding Bipartite Components in Hypergraphs", publis

Peter Macgregor 5 May 06, 2022
Implementation of Segnet, FCN, UNet , PSPNet and other models in Keras.

Image Segmentation Keras : Implementation of Segnet, FCN, UNet, PSPNet and other models in Keras. Implementation of various Deep Image Segmentation mo

Divam Gupta 2.6k Jan 05, 2023
Red Team tool for exfiltrating files from a target's Google Drive that you have access to, via Google's API.

GD-Thief Red Team tool for exfiltrating files from a target's Google Drive that you(the attacker) has access to, via the Google Drive API. This includ

Antonio Piazza 39 Dec 27, 2022
Official repository of OFA. Paper: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Paper | Blog OFA is a unified multimodal pretrained model that unifies modalities (i.e., cross-modality, vision, language) and tasks (e.g., image gene

OFA Sys 1.4k Jan 08, 2023
PyTorch implementation of the Value Iteration Networks (VIN) (NIPS '16 best paper)

Value Iteration Networks in PyTorch Tamar, A., Wu, Y., Thomas, G., Levine, S., and Abbeel, P. Value Iteration Networks. Neural Information Processing

LEI TAI 75 Nov 24, 2022
Customizable RecSys Simulator for OpenAI Gym

gym-recsys: Customizable RecSys Simulator for OpenAI Gym Installation | How to use | Examples | Citation This package describes an OpenAI Gym interfac

Xingdong Zuo 14 Dec 08, 2022
official code for dynamic convolution decomposition

Revisiting Dynamic Convolution via Matrix Decomposition (ICLR 2021) A pytorch implementation of DCD. If you use this code in your research please cons

Yunsheng Li 110 Nov 23, 2022
Dahua Camera and Doorbell Home Assistant Integration

Home Assistant Dahua Integration The Dahua Home Assistant integration allows you to integrate your Dahua cameras and doorbells in Home Assistant. It's

Ronnie 216 Dec 26, 2022
Preparation material for Dropbox interviews

Dropbox-Onsite-Interviews A guide for the Dropbox onsite interview! The Dropbox interview question bank is very small. The bank has been in a Chinese

386 Dec 31, 2022
The Instructed Glacier Model (IGM)

The Instructed Glacier Model (IGM) Overview The Instructed Glacier Model (IGM) simulates the ice dynamics, surface mass balance, and its coupling thro

27 Dec 16, 2022
The tl;dr on a few notable transformer/language model papers + other papers (alignment, memorization, etc).

The tl;dr on a few notable transformer/language model papers + other papers (alignment, memorization, etc).

Will Thompson 166 Jan 04, 2023
🇰🇷 Text to Image in Korean

KoDALLE Utilizing pretrained language model’s token embedding layer and position embedding layer as DALLE’s text encoder. Background Training DALLE mo

HappyFace 74 Sep 22, 2022
The implemetation of Dynamic Nerual Garments proposed in Siggraph Asia 2021

DynamicNeuralGarments Introduction This repository contains the implemetation of Dynamic Nerual Garments proposed in Siggraph Asia 2021. ./GarmentMoti

42 Dec 27, 2022
给yolov5加个gui界面,使用pyqt5,yolov5是5.0版本

博文地址 https://xugaoxiang.com/2021/06/30/yolov5-pyqt5 代码执行 项目中使用YOLOv5的v5.0版本,界面文件是project.ui pip install -r requirements.txt python main.py 图片检测 视频检测

Xu GaoXiang 215 Dec 30, 2022
Pytorch implementation of 'Fingerprint Presentation Attack Detector Using Global-Local Model'

RTK-PAD This is an official pytorch implementation of 'Fingerprint Presentation Attack Detector Using Global-Local Model', which is accepted by IEEE T

6 Aug 01, 2022
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

DLR-RM 4.7k Jan 01, 2023
API for RL algorithm design & testing of BCA (Building Control Agent) HVAC on EnergyPlus building energy simulator by wrapping their EMS Python API

RL - EmsPy (work In Progress...) The EmsPy Python package was made to facilitate Reinforcement Learning (RL) algorithm research for developing and tes

20 Jan 05, 2023
使用OpenCV部署全景驾驶感知网络YOLOP,可同时处理交通目标检测、可驾驶区域分割、车道线检测,三项视觉感知任务,包含C++和Python两种版本的程序实现。本套程序只依赖opencv库就可以运行, 从而彻底摆脱对任何深度学习框架的依赖。

YOLOP-opencv-dnn 使用OpenCV部署全景驾驶感知网络YOLOP,可同时处理交通目标检测、可驾驶区域分割、车道线检测,三项视觉感知任务,依然是包含C++和Python两种版本的程序实现 onnx文件从百度云盘下载,链接:https://pan.baidu.com/s/1A_9cldU

178 Jan 07, 2023