Time Discretization-Invariant Safe Action Repetition for Policy Gradient Methods

Overview

Time Discretization-Invariant Safe Action Repetition for Policy Gradient Methods

This repository is the official implementation of

  • Seohong Park, Jaekyeom Kim, Gunhee Kim. Time Discretization-Invariant Safe Action Repetition for Policy Gradient Methods. In NeurIPS, 2021.

It contains the implementations for SAR, FiGAR-C and base policy gradient algorithms (PPO, TRPO and A2C).

The code is based on Stable Baselines3 (SB3) for PPO and A2C, and Stable Baselines (SB) for TRPO.

Requirements

Run examples

PPO and A2C (based on Stable Baselines3)

To install requirements:

cd sb3
pip install -r requirements.txt
pip install -e .

Train SAR-PPO on InvertedPendulum-v2 with δ = 0.01:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.01 --max_t=0.05 --max_d=0.5

Train FiGAR-C-PPO on InvertedPendulum-v2 with δ = 0.01:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.01 --max_t=0.05

Train PPO on InvertedPendulum-v2 with the original δ:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg

Train SAR-A2C on InvertedPendulum-v2 with δ = 0.01:

python repeat/main.py --env=InvertedPendulum-v2 --algo=a2c_rg --frame_skip=1 --dt=0.01 --max_t=0.05 --max_d=0.5

Train SAR-PPO on InvertedPendulum-v2 with δ = 0.002 and the "Action Noise" setting:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.002 --max_t=0.05 --max_d=0.5 --anoise_type=action --anoise_prob=0.05 --anoise_std=3

Train SAR-PPO on InvertedPendulum-v2 with δ = 0.002 and the "External Force" setting:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.002 --max_t=0.05 --max_d=0.5 --anoise_type=ext_f --anoise_prob=0.05 --anoise_std=300

Train SAR-PPO on InvertedPendulum-v2 with δ = 0.002 and the "Strong External Force (Perceptible)" setting:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.002 --max_t=0.05 --max_d=0.5 --anoise_type=ext_fpc --anoise_prob=0.05 --anoise_std=1000

TRPO (based on Stable Baselines)

To install requirements:

cd sb
pip install -r requirements.txt
pip install -e .

Train SAR-TRPO on InvertedPendulum-v2 with δ = 0.01:

python repeat/main.py --env=InvertedPendulum-v2 --frame_skip=1 --dt=0.01 --max_t=0.05 --max_d=0.5

License

This codebase is licensed under the MIT License. See also sb3/LICENSE_SB3 and sb/LICENSE_SB.

Owner
Seohong Park
Seohong Park
If you are worried about being found perhaps try taking cover under a blanket. Pure Python PowerShell Obfuscator

If you are worried about being found perhaps try taking cover under a blanket. Pure Python PowerShell Obfuscator

Ph0tonz 3 Jun 07, 2022
带回显版本的漏洞利用脚本

CVE-2021-21978 带回显版本的漏洞利用脚本,更简单的方式 0. 漏洞信息 VMware View Planner Web管理界面存在一个上传日志功能文件的入口,没有进行认证且写入的日志文件路径用户可控,通过覆盖上传日志功能文件log_upload_wsgi.py,即可实现RCE 漏洞代码

3ky7in4 24 Nov 09, 2022
This repository will contain python scripts for hackers and pentesters

This repository will contain python scripts for hackers and pentesters. stop being limited with availble tools. Build your own.

0xTRAW 24 Nov 29, 2022
Confluence OGNL injection

CVE-2021-26084 Confluence OGNL injection CVE-2021-26084 is an Object-Graph Navigation Language (OGNL) injection vulnerability in the Atlassian Conflue

Ashish Kunwar 15 Sep 23, 2022
A decompilation of the Nintendo Switch version of Captain Toad: Treasure Tracker

cttt-decomp A decompilation of the Nintendo Switch version of Captain Toad: Trea

shibbs 14 Aug 17, 2022
Script Crack Facebook Yang Kaya Akan Teh Hijau 🚶‍♂

r-mbf Script Crack Facebook 🚶‍♂ Bukti Recode [•] Install Script $ pkg update && pkg upgrade $ pkg install python $ pkg install git $ pip install requ

O'Hayo Smrn 3 Apr 02, 2022
Get important strings inside [Info.plist] & and Binary file also all output of result it will be saved in [app_binary].json , [app_plist_file].json file

Get important strings inside [Info.plist] & and Binary file also all output of result it will be saved in [app_binary].json , [app_plist_file].json file

12 Sep 28, 2022
A script based on sqlmap that uses sql injection vulnerabilities to traverse the existence of a file

A script based on sqlmap that uses sql injection vulnerabilities to traverse the existence o

2 Nov 09, 2022
Dapunta Multi Brute Force Facebook - Crack Facebook With Login - Free

✭ DMBF CRACK Dibuat Dengan ❤️ Oleh Dapunta Author: - Dapunta Khurayra X ⇨ Fitur Login [✯] Login Token ⇨ Fitur Crack [✯] Crack Dari Teman, Public,

Dapunta ID 10 Oct 19, 2022
pybotnet - A Python Library for building Botnet , Trojan or BackDoor for windows and linux with Telegram control panel

pybotnet A Python Library for building botnet , trojan or backdoor for windows and linux with Telegram control panel Disclaimer: Please note that this

</oNion 181 Jan 02, 2023
PoC for CVE-2020-6207 (Missing Authentication Check in SAP Solution Manager)

PoC for CVE-2020-6207 (Missing Authentication Check in SAP Solution Manager) This script allows to check and exploit missing authentication checks in

chipik 82 Nov 09, 2022
Static Token And Credential Scanner

Static Token And Credential Scanner What is it? STACS is a YARA powered static credential scanner which suports binary file formats, analysis of neste

STACS 81 Dec 27, 2022
NExfil is an OSINT tool written in python for finding profiles by username.

NExfil is an OSINT tool written in python for finding profiles by username. The provided usernames are checked on over 350 websites within few seconds.

thewhiteh4t 1.4k Jan 01, 2023
Ducky Script is the payload language of Hak5 gear.

Ducky Script is the payload language of Hak5 gear. Since its introduction with the USB Rubber Ducky in 2010, Ducky Script has grown in capability while maintaining simplicity. Aided by Bash for logic

Abir Abedin Khan 6 Oct 07, 2022
A Telegram Bot to force users to join a specific channel before sending messages in a group.

Promoter A Telegram Bot to force users to join a specific channel before sending messages in a group. Introduction A Telegram Bot to force users to jo

Mr. Dynamic 1 Jan 27, 2022
Remote Desktop Protocol in Twisted Python

RDPY Remote Desktop Protocol in twisted python. RDPY is a pure Python implementation of the Microsoft RDP (Remote Desktop Protocol) protocol (client a

Sylvain Peyrefitte 1.6k Dec 30, 2022
A compact version of EDI-Vetter, which uses the TLS output to quickly vet transit signals.

A compact version of EDI-Vetter, which uses the TLS output to quickly vet transit signals. All your favorite hits in a simplified format.

Jon Zink 2 Aug 03, 2022
Exploiting CVE-2021-44228 in Unifi Network Application for remote code execution and more

Log4jUnifi Exploiting CVE-2021-44228 in Unifi Network Application for remote cod

96 Jan 02, 2023
Program that mathematically generates and validates CPF numbers

✔️ Gerador e Validador de CPF Programa que gera e valida números de CPF Requisitos • Como usar • Capturas de Tela Requisitos Antes de começar, você va

João Victor Vilela dos Santos 1 Nov 07, 2021
Tenssens framework focused on gathering information from free tools or resources. The intention is to help people find free OSINT resources.

Tenssens framework focused on gathering information from free tools or resources. The intention is to help people find free OSINT resources.

Md. Nur habib 31 Oct 21, 2022