Policy Gradient Algorithms (One Step Actor Critic & PPO) from scratch using Numpy

Overview

Policy Gradient Algorithms From Scratch (NumPy)

This repository showcases two policy gradient algorithms (One Step Actor Critic and Proximal Policy Optimization) applied to two MDPs. The algorithms are implemented from scratch with Numpy and utilize linear regression for the value function and single layer Softmax for the policy. The MDPs are: Gridworld and Mountain Car.

Run Instructions

Packages:

numpy and matplotlib

Create virtual environment, install requirements and run: (windows instructions)

  1. Run python -m venv venv
  2. Run .\venv\Scripts\activate (windows)
  3. Run pip install -r requirements.txt
  4. Run python .\experiments.py be wary of long compute times and plots that will pop up and must be exited in order to comtinue.

Some Sample Plots

Files

  • experiments.py - Runs pre programmed experiments that output various plots both in the terminal and saved to .png files.
  • mdp.py - Contains two MDP domains: Gridworld and Mountain Car, that the experiments are run on.
  • models.py - Contains ValueFunction and Policy which are the two models used (linear layers) for function approximation by the algorithms.
  • policy_gradient_algorithms.py - Contains the policy gradient algorithms One Step Actor Critic and Proximal Policy Optimization (PPO).

MIT License

Python algorithm to determine the optimal elevation threshold of a GNSS receiver, by using a statistical test known as the Brown-Forsynthe test.

Levene and Brown-Forsynthe: Test for variances Application to Global Navigation Satellite Systems (GNSS) Python algorithm to determine the optimal ele

Nicolas Gachancipa 2 Aug 09, 2022
Official implementation of "Path Planning using Neural A* Search" (ICML-21)

Path Planning using Neural A* Search (ICML 2021) This is a repository for the following paper: Ryo Yonetani*, Tatsunori Taniai*, Mohammadamin Barekata

OMRON SINIC X 82 Jan 07, 2023
A simple library for implementing common design patterns.

PyPattyrn from pypattyrn.creational.singleton import Singleton class DummyClass(object, metaclass=Singleton): # DummyClass is now a Singleton!

1.7k Jan 01, 2023
A raw implementation of the nearest insertion algorithm to resolve TSP problems in a TXT format.

TSP-Nearest-Insertion A raw implementation of the nearest insertion algorithm to resolve TSP problems in a TXT format. Instructions Load a txt file wi

sjas_Phantom 1 Dec 02, 2021
causal-learn: Causal Discovery for Python

causal-learn: Causal Discovery for Python Causal-learn is a python package for causal discovery that implements both classical and state-of-the-art ca

589 Dec 29, 2022
PathPlanning - Common used path planning algorithms with animations.

Overview This repository implements some common path planning algorithms used in robotics, including Search-based algorithms and Sampling-based algori

Huiming Zhou 5.1k Jan 08, 2023
N Queen Problem using Genetic Algorithm

The N Queen is the problem of placing N chess queens on an N×N chessboard so that no two queens attack each other.

Mahdi Hassanzadeh 2 Nov 11, 2022
🌟 Python algorithm team note for programming competition or coding test

🌟 Python algorithm team note for programming competition or coding test

Seung Hoon Lee 3 Feb 25, 2022
Algorithms and utilities for SAR sensors

WARNING: THIS CODE IS NOT READY FOR USE Sarsen Algorithms and utilities for SAR sensors Objectives Be faster and simpler than ESA SNAP and cloud nativ

B-Open 201 Dec 27, 2022
The DarkRift2 networking framework written in Python 3

DarkRiftPy is Darkrift2 written in Python 3. The implementation is fully compatible with the original version. So you can write a client side on Python that connects to a Darkrift2 server written in

Anton Dobryakov 6 May 23, 2022
HashDB is a community-sourced library of hashing algorithms used in malware.

HashDB HashDB is a community-sourced library of hashing algorithms used in malware. How To Use HashDB HashDB can be used as a stand alone hashing libr

OALabs 216 Jan 06, 2023
A* (with 2 heuristic functions), BFS , DFS and DFS iterativeA* (with 2 heuristic functions), BFS , DFS and DFS iterative

Descpritpion This project solves the Taquin game (jeu de taquin) problem using different algorithms : A* (with 2 heuristic functions), BFS , DFS and D

Ayari Ahmed 3 May 09, 2022
Implementation of an ordered dithering algorithm used in computer graphics

Ordered Dithering Project In this project, we use an ordered dithering method to turn an RGB image, first to a gray scale image and then, turn the gra

1 Oct 26, 2021
This project consists of a collaborative filtering algorithm to predict movie reviews ratings from a dataset of Netflix ratings.

Collaborative Filtering - Netflix movie reviews Description This project consists of a collaborative filtering algorithm to predict movie reviews rati

Shashank Kumar 1 Dec 21, 2021
A collection of design patterns/idioms in Python

python-patterns A collection of design patterns and idioms in Python. Current Patterns Creational Patterns: Pattern Description abstract_factory use a

Sakis Kasampalis 36.2k Jan 05, 2023
iAWE is a wonderful dataset for those of us who work on Non-Intrusive Load Monitoring (NILM) algorithms.

iAWE is a wonderful dataset for those of us who work on Non-Intrusive Load Monitoring (NILM) algorithms. You can find its main page and description via this link. If you are familiar with NILM-TK API

Mozaffar Etezadifar 3 Mar 19, 2022
Python Client for Algorithmia Algorithms and Data API

Algorithmia Common Library (python) Python client library for accessing the Algorithmia API For API documentation, see the PythonDocs Algorithm Develo

Algorithmia 138 Oct 26, 2022
Optimal skincare partition finder using graph theory

Pigment The problem of partitioning up a skincare regime into parts such that each part does not interfere with itself is equivalent to the minimal cl

Jason Nguyen 1 Nov 22, 2021
Using Bayesian, KNN, Logistic Regression to classify spam and non-spam.

Make Sure the dataset file "spamData.mat" is in the folder spam\src Environment: Python --version = 3.7 Third Party: numpy, matplotlib, math, scipy

0 Dec 26, 2021
My own Unicode compression algorithm

Zee Code ZCode is a custom compression algorithm I originally developed for a competition held for the Spring 2019 Datastructures and Algorithms cours

Vahid Zehtab 2 Oct 20, 2021