Safe Policy Optimization with Local Features

Overview

Safe Policy Optimization with Local Feature (SPO-LF)

This is the source-code for implementing the algorithms in the paper "Safe Policy Optimization with Local Generalized Linear Function Approximations" which was presented in NeurIPS-21.

Installation

There is requirements.txt in this repository. Except for the common modules (e.g., numpy, scipy), our source code depends on the following modules.

We also provide Dockerfile in this repository, which can be used for reproducing our grid-world experiment.

Simulation configuration

We manage the simulation configuration using hydra. Configurations are listed in config.yaml. For example, the algorithm to run should be chosen from the ones we implemented:

sim_type: {safe_glm, unsafe_glm, random, oracle, safe_gp_state, safe_gp_feature, safe_glm_stepwise}

Grid World Experiment

The source code necessary for our grid-world experiment is contained in /grid_world folder. To run the simulation, for example, use the following commands.

cd grid_world
python main.py sim_type=safe_glm env.reuse_env=False

For the monte carlo simulation while comparing our proposed method with baselines, use the shell file, run.sh.

We also provide a script for visualization. If you want to render how the agent behaves, use the following command.

python main.py sim_type=safe_glm env.reuse_env=True

Safety-Gym Experiment

The source code necessary for our safety-gym experiment is contained in /safety_gym_discrete folder. Our experiment is based on safety-gym. Our proposed method utilize dynamic programming algorithms to solve Bellman Equation, so we modified engine.py to discrtize the environment. We attach modified safety-gym source code in /safety_gym_discrete/engine.py. To use the modified library, please clone safety-gym, then replace safety-gym/safety_gym/envs/engine.py using /safety_gym_discrete/engine.py in our repo. Using the following commands to install the modified library:

cd safety_gym
pip install -e .

Note that MuJoCo licence is needed for installing Safety-Gym. To run the simulation, use the folowing commands.

cd safety_gym_discrete
python main.py sim_idx=0

We compare our proposed method with three notable baselines: CPO, PPO-Lagrangian, and TRPO-Lagrangian. The baseline implementation depends on safety-starter-agents. We modified run_agent.py in the repo source code.

To run the baseline, use the folowing commands.

cd safety_gym_discrete/baseline
python baseline_run.py sim_type=cpo

The environment that agent runs on is generated using generate_env.py. We provide 10 50*50 environments. If you want to generate other environments, you can change the world shape in safety_gym_discrete.py, and running the following commands:

cd safety_gym_discrete
python generate_env.py

Citation

If you find this code useful in your research, please consider citing:

@inproceedings{wachi_yue_sui_neurips2021,
  Author = {Wachi, Akifumi and Wei, Yunyue and Sui, Yanan},
  Title = {Safe Policy Optimization with Local Generalized Linear Function Approximations},
  Booktitle  = {Neural Information Processing Systems (NeurIPS)},
  Year = {2021}
}
Owner
Akifumi Wachi
Akifumi Wachi
Red Team Toolkit is an Open-Source Django Offensive Web-App which is keeping the useful offensive tools used in the red-teaming together.

RedTeam Toolkit Note: Only legal activities should be conducted with this project. Red Team Toolkit is an Open-Source Django Offensive Web-App contain

Mohammadreza Sarayloo 382 Jan 01, 2023
Search Shodan for Minecraft server IPs to grief

GriefBuddy This script searches Shodan for Minecraft server IPs to grief. This will return all servers connected to the public internet which Shodan h

26 Dec 29, 2022
Scan all java processes on your host to check weather it's affected by log4j2 remote code execution

Log4j2 Vulnerability Local Scanner (CVE-2021-45046) Log4j 漏洞本地检测脚本,扫描主机上所有java进程,检测是否引入了有漏洞的log4j-core jar包,是否可能遭到远程代码执行攻击(CVE-2021-45046)。上传扫描报告到指定的服

86 Dec 09, 2022
Herramienta para descargar eventos de Sucuri WAF hacia disco.

Descarga los eventos de Sucuri Script para descargar los eventos del Sucuri Web Application Firewall (WAF) en el disco como archivos CSV. Requerimient

CSIRT-RD 2 Nov 29, 2021
PyExtractor is a decompiler that can fully decompile exe's compiled with pyinstaller or py2exe

PyExtractor is a decompiler that can fully decompile exe's compiled with pyinstaller or py2exe with additional features such as malware checker/detector! Also checks file(s) for suspicious words, dis

Rdimo 56 Jul 31, 2022
Whois-Python - Get Whois Domain with Python GUI

Whois-Python-GUI Get Whois Domain with Python - GUI :) WARNING Dont Copy ! - W

MR.D3F417 3 Feb 21, 2022
😭 WSOB is a python tool created to exploit the new vulnerability on WSO2 assigned as CVE-2022-29464.

😭 WSOB (CVE-2022-29464) 😭 WSOB is a python tool created to exploit the new vulnerability on WSO2 assigned as CVE-2022-29464. CVE-2022-29464 details:

0p 25 Oct 14, 2022
This is a partial and quick and dirty proof of concept implementation of the following specifications to configure a tor client to use trusted exit relays only.

This is a partial and quick and dirty proof of concept implementation of the following specifications to configure a tor client to use trusted exit re

22 Nov 09, 2022
This is simple python FTP password craker. To crack FTP login using wordlist based brute force attack

This is simple python FTP password craker. To crack FTP login using wordlist based brute force attack

Varun Jagtap 5 Oct 08, 2022
orfipy is a tool written in python/cython to extract ORFs in an extremely and fast and flexible manner

Introduction orfipy is a tool written in python/cython to extract ORFs in an extremely and fast and flexible manner. Other popular ORF searching tools

Urminder Singh 34 Nov 21, 2022
Course: Information Security with Python

Curso: Segurança da Informação com Python Curso realizado atravès da Plataforma da Digital Innovation One Prof: Bruno Dias Conteúdo: Introdução aos co

Elizeu Barbosa Abreu 1 Nov 28, 2021
🎻 Modularized exploit generation framework

Modularized exploit generation framework for x86_64 binaries Overview This project is still at early stage of development, so you might want to come b

ᴀᴇꜱᴏᴘʜᴏʀ 30 Jan 17, 2022
python driver for fingerprint machine (ZKTeco biometrics)

fpmachine python driver for fingerprint machine (ZKTeco biometrics) support until now 2 model supported and tested ZMM100_TFT and ZMM220_TFT install p

Samy Sultan 4 Oct 06, 2022
RedlineSpam - Python tool to spam Redline Infostealer panels with legit looking data

RedlineSpam Python tool to spam Redline Infostealer panels with legit looking da

4 Jan 27, 2022
LinOTP - the open source solution for two factor authentication

LinOTP LinOTP - the Open Source solution for multi-factor authentication Copyright © 2010-2019 KeyIdentity GmbH Coypright © 2019- arxes-tolina GmbH In

LinOTP 462 Jan 02, 2023
CVE-2022-22965 : about spring core rce

CVE-2022-22965: Spring-Core-Rce EXP 特性: 漏洞探测(不写入 webshell,简单字符串输出) 自定义写入 webshell 文件名称及路径 不会追加写入到同一文件中,每次检测写入到不同名称 webshell 文件 支持写入 冰蝎 webshell 代理支持,可

东方有鱼名为咸 53 Nov 09, 2022
Android Malware (Analysis | Scoring) System

An Obfuscation-Neglect Android Malware Scoring System Quark-Engine is also bundled with Kali Linux, BlackArch. A trust-worthy, practical tool that's r

Quark-Engine 1k Jan 04, 2023
Show apps recorded storage files by jailbreak

0x101 Show registered storage files of apps by jailbreak Legal disclaimer: Usage of insTof for attacking targets without prior mutual consent is illeg

0x 4 Oct 24, 2022
Website OSINT untuk mencari informasi dari email dan nomor telepon. Dibuat dengan React dan Flask.

Inspektur Cari informasi mengenai email dan nomor telepon dengan mudah. Inspektur adalah aplikasi OSINT yang berguna untuk mencari informasi berdasark

Bagas Wastu 36 Dec 04, 2022
Vulmap 是一款 web 漏洞扫描和验证工具, 可对 webapps 进行漏洞扫描, 并且具备漏洞利用功能

Vulmap 是一款 web 漏洞扫描和验证工具, 可对 webapps 进行漏洞扫描, 并且具备漏洞利用功能

之乎者也 2.8k Dec 29, 2022