Policy Gradient Algorithms (One Step Actor Critic & PPO) from scratch using Numpy

Overview

Policy Gradient Algorithms From Scratch (NumPy)

This repository showcases two policy gradient algorithms (One Step Actor Critic and Proximal Policy Optimization) applied to two MDPs. The algorithms are implemented from scratch with Numpy and utilize linear regression for the value function and single layer Softmax for the policy. The MDPs are: Gridworld and Mountain Car.

Run Instructions

Packages:

numpy and matplotlib

Create virtual environment, install requirements and run: (windows instructions)

  1. Run python -m venv venv
  2. Run .\venv\Scripts\activate (windows)
  3. Run pip install -r requirements.txt
  4. Run python .\experiments.py be wary of long compute times and plots that will pop up and must be exited in order to comtinue.

Some Sample Plots

Files

  • experiments.py - Runs pre programmed experiments that output various plots both in the terminal and saved to .png files.
  • mdp.py - Contains two MDP domains: Gridworld and Mountain Car, that the experiments are run on.
  • models.py - Contains ValueFunction and Policy which are the two models used (linear layers) for function approximation by the algorithms.
  • policy_gradient_algorithms.py - Contains the policy gradient algorithms One Step Actor Critic and Proximal Policy Optimization (PPO).

MIT License

A* (with 2 heuristic functions), BFS , DFS and DFS iterativeA* (with 2 heuristic functions), BFS , DFS and DFS iterative

Descpritpion This project solves the Taquin game (jeu de taquin) problem using different algorithms : A* (with 2 heuristic functions), BFS , DFS and D

Ayari Ahmed 3 May 09, 2022
Sorting-Algorithms - All information about sorting algorithm you need and you can visualize the code tracer

Sorting-Algorithms - All information about sorting algorithm you need and you can visualize the code tracer

Ahmed Hossam 15 Oct 16, 2022
All Algorithms implemented in Python

The Algorithms - Python All algorithms implemented in Python (for education) These implementations are for learning purposes only. Therefore they may

The Algorithms 150.6k Jan 03, 2023
Pathfinding visualizer in pygame: A*

Pathfinding Visualizer A* What is this A* algorithm ? Simply put, it is an algorithm that aims to find the shortest possible path between two location

0 Feb 26, 2022
GoldenSAML Attack Libraries and Framework

WhiskeySAML and Friends TicketsPlease TicketsPlease: Python library to assist with the generation of Kerberos tickets, remote retrieval of ADFS config

Secureworks 43 Jan 03, 2023
This project consists of a collaborative filtering algorithm to predict movie reviews ratings from a dataset of Netflix ratings.

Collaborative Filtering - Netflix movie reviews Description This project consists of a collaborative filtering algorithm to predict movie reviews rati

Shashank Kumar 1 Dec 21, 2021
A Python project for optimizing the 8 Queens Puzzle using the Genetic Algorithm implemented in PyGAD.

8QueensGenetic A Python project for optimizing the 8 Queens Puzzle using the Genetic Algorithm implemented in PyGAD. The project uses the Kivy cross-p

Ahmed Gad 16 Nov 13, 2022
Python sample codes for robotics algorithms.

PythonRobotics Python codes for robotics algorithm. Table of Contents What is this? Requirements Documentation How to use Localization Extended Kalman

Atsushi Sakai 17.2k Jan 01, 2023
Sign data using symmetric-key algorithm encryption.

Sign data using symmetric-key algorithm encryption. Validate signed data and identify possible validation errors. Uses sha-(1, 224, 256, 385 and 512)/hmac for signature encryption. Custom hash algori

Artur Barseghyan 39 Jun 10, 2022
An implementation of ordered dithering algorithm in python as multimedia course project

One way of minimizing the size of an image is to simply reduce the number of bits you use to represent each pixel.

7 Dec 02, 2022
This application solves sudoku puzzles using a backtracking recursive algorithm

This application solves sudoku puzzles using a backtracking recursive algorithm. The user interface is coded with Pygame to allow users to easily input puzzles.

Glenda T 0 May 17, 2022
A lightweight, pure-Python mobile robot simulator designed for experiments in Artificial Intelligence (AI) and Machine Learning, especially for Jupyter Notebooks

aitk.robots A lightweight Python robot simulator for JupyterLab, Notebooks, and other Python environments. Goals A lightweight mobile robotics simulat

3 Oct 22, 2021
Classic algorithms including Fizz Buzz, Bubble Sort, the Fibonacci Sequence, a Sudoku solver, and more.

Algorithms Classic algorithms including Fizz Buzz, Bubble Sort, the Fibonacci Sequence, a Sudoku solver, and more. Algorithm Complexity Time and Space

1 Jan 14, 2022
How on earth can I ever think of a solution like that in an interview?!

fuck-coding-interviews This repository is created by an awkward programmer who always struggles with coding problems on LeetCode, even with some Easy

Vinta Chen 613 Jan 08, 2023
This project is an implementation of a simple K-means algorithm

Simple-Kmeans-Clustering-Algorithm Abstract K-means is a centroid-based algorithm, or a distance-based algorithm, where we calculate the distances to

Saman Khamesian 7 Aug 09, 2022
A simple python implementation of A* and bfs algorithm solving Eight-Puzzle

A simple python implementation of A* and bfs algorithm solving Eight-Puzzle

2 May 22, 2022
This repository explores an implementation of Grover's Algorithm for knights on a chessboard.

Grover Knights Welcome to my Knights project! Project Description: I explore an implementation of a quantum oracle for knights on a chessboard.

Will Sun 8 Feb 22, 2022
A pure Python implementation of a mixed effects random forest (MERF) algorithm

Mixed Effects Random Forest This repository contains a pure Python implementation of a mixed effects random forest (MERF) algorithm. It can be used, o

Manifold 199 Dec 06, 2022
Multiple Imputation with Random Forests in Python

miceforest: Fast, Memory Efficient Imputation with lightgbm Fast, memory efficient Multiple Imputation by Chained Equations (MICE) with lightgbm. The

Samuel Wilson 202 Dec 31, 2022
Algorithms and data structures for educational, demonstrational and experimental purposes.

Algorithms and Data Structures (ands) Introduction This project was created for personal use mostly while studying for an exam (starting in the month

50 Dec 06, 2022