Reinforcement Learning Theory Book (rus)

Last update: Nov 27, 2022

Related tags: Deep Learning, RL-Theory-book

Full book on arXiv: https://arxiv.org/abs/2201.09746

Ch. 1: Introduction
Ch. 2: Meta-heuristics
- NEAT, WANN
- CEM, OpenAI-ES, CMA-ES
Ch. 3: Classic theory
- Bellman equations
- RPI, policy improvement theorem
- Value Iteration, Generalized Policy Iteration
- Temporal Difference, Q-learning, SARSA
- Eligibility Traces, TD-lambda, Retrace
Ch. 4: Value-based
- DQN
- Double DQN, Dueling DQN, PER, Noisy DQN, Multi-step DQN
- C51, QR-DQN, IQN, Rainbow DQN
Ch. 5: Policy Gradient
- REINFORCE, A2C, GAE
- TRPO, PPO
Ch. 6: Continuous Control
- DDPG, TD3
- SAC
Ch. 7: Model-based
- Bandits
- MCTS, AlphaZero, MuZero
- LQR
Ch. 8: Next Stage
- Imitation Learning / Inverse Reinforcement Learning
- Intrinsic Motivation
- Multi-Task and Hindsight
- Hierarchical RL
- Partial observability
- Multi-Agent RL
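
As a small illustration of the Ch. 3 topics (Bellman equations, Value Iteration), here is a minimal sketch of value iteration on a toy problem. The four-state chain, the reward of 1 for reaching the last state, the discount 0.9, and all names (`step`, `value_iteration`, `greedy_policy`) are assumptions made for this example; none of them are taken from the book.

```python
GAMMA = 0.9           # discount factor (assumed for this example)
N_STATES = 4          # states 0..3; state 3 is terminal
ACTIONS = (-1, +1)    # move left or move right


def step(state, action):
    """Deterministic toy transition: returns (next_state, reward)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def value_iteration(tol=1e-8):
    """Apply the Bellman optimality backup until values stop changing."""
    values = [0.0] * N_STATES  # the terminal state keeps value 0
    while True:
        delta = 0.0
        for s in range(N_STATES - 1):  # skip the terminal state
            # V(s) <- max_a [ r(s, a) + gamma * V(s') ]
            best = max(
                r + GAMMA * values[s2]
                for s2, r in (step(s, a) for a in ACTIONS)
            )
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


def greedy_policy(values):
    """Pick, in each non-terminal state, the action with the best backup."""
    policy = []
    for s in range(N_STATES - 1):
        def q(a, s=s):
            s2, r = step(s, a)
            return r + GAMMA * values[s2]
        policy.append(max(ACTIONS, key=q))
    return policy


if __name__ == "__main__":
    v = value_iteration()
    print("values:", v)
    print("greedy policy:", greedy_policy(v))
```

On this chain the values shrink by a factor of `GAMMA` per step away from the goal, and the greedy policy moves right in every non-terminal state.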

Owner

qbrick

GitHub Repository

Unofficial PyTorch code for BasicVSR

Dependencies and Installation: The code is based on BasicSR; please install the BasicSR framework first. Pytorch=1.51 Training: cd ./code CUDA_VISIBLE_

59 Dec 06, 2022

MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts (ICLR 2022)

MetaShift: A Dataset of Datasets for Evaluating Distribution Shifts and Training Conflicts This repo provides the PyTorch source code of our paper: Me

88 Jan 04, 2023

Image De-raining Using a Conditional Generative Adversarial Network

Image De-raining Using a Conditional Generative Adversarial Network [Paper Link] [Project Page] He Zhang, Vishwanath Sindagi, Vishal M. Patel In this

216 Dec 18, 2022

An implementation of "MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing" (ICML 2019).

MixHop and N-GCN ⠀ A PyTorch implementation of "MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing" (ICML 2019)

393 Dec 13, 2022

Code for approximate graph reduction techniques for cardinality-based DSFM, from paper

SparseCard Code for approximate graph reduction techniques for cardinality-based DSFM, from paper "Approximate Decomposable Submodular Function Minimi

1 Nov 25, 2022

PyTorch code for SENTRY: Selective Entropy Optimization via Committee Consistency for Unsupervised DA

PyTorch Code for SENTRY: Selective Entropy Optimization via Committee Consistency for Unsupervised Domain Adaptation Viraj Prabhu, Shivam Khare, Deeks

46 Dec 24, 2022

Supervised Contrastive Learning for Downstream Optimized Sequence Representations

SupCL-Seq 📖 Supervised Contrastive Learning for Downstream Optimized Sequence representations (SupCS-Seq) accepted to be published in EMNLP 2021, ext

18 Oct 21, 2022

CLDF dataset derived from Robbeets et al.'s "Triangulation Supports Agricultural Spread" from 2021

CLDF dataset derived from Robbeets et al.'s "Triangulation Supports Agricultural Spread" from 2021 How to cite If you use these data please cite the o

2 Dec 20, 2021

OpenDILab Multi-Agent Environment

Go-Bigger: Multi-Agent Decision Intelligence Environment GoBigger Doc (中文版) Ongoing 2021.11.13 We are holding a competition —— Go-Bigger: Multi-Agent

441 Jan 05, 2023

Bridging Vision and Language Model

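BriVL BriVL (Bridging Vision and Language Model) is the first Chinese general-purpose large-scale image-text multimodal pre-trained model. BriVL achieves excellent results on image-text retrieval, surpassing other common multimodal pre-trained models of the same period (e.g. UNITER, CLIP). BriVL paper: WenLan: Bridgi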

235 Dec 27, 2022

Implementation of MA-Trace - a general-purpose multi-agent RL algorithm for cooperative environments.

Off-Policy Correction For Multi-Agent Reinforcement Learning This repository is the official implementation of Off-Policy Correction For Multi-Agent R

4 Aug 18, 2022

Source code for Zalo AI 2021 submission

zalo_ltr_2021 Source code for Zalo AI 2021 submission Solution: Pipeline We use the pipeline in the picture below: Our pipeline is a combination of BM2

128 Dec 27, 2022

A basic duplicate image detection service using perceptual image hash functions and nearest neighbor search, implemented using faiss, fastapi, and imagehash

Duplicate Image Detection Getting Started Install dependencies pip install -r requirements.txt Run service python main.py Testing Test with pytest How

21 Nov 11, 2022

Implementation of Auto-Conditioned Recurrent Networks for Extended Complex Human Motion Synthesis

acLSTM_motion This folder contains an implementation of acRNN for the CMU motion database written in Pytorch. See the following links for more backgro

61 Sep 07, 2022

Experimental code for paper: Generative Adversarial Networks as Variational Training of Energy Based Models

Experimental code for paper: Generative Adversarial Networks as Variational Training of Energy Based Models, under review at ICLR 2017 requirements: T

18 Mar 05, 2022

An unofficial PyTorch implementation of a federated learning algorithm, FedAvg.

Federated Averaging (FedAvg) in PyTorch An unofficial implementation of FederatedAveraging (or FedAvg) algorithm proposed in the paper Communication-E

123 Jan 06, 2023

Implementation of SegNet: A Deep Convolutional Encoder-Decoder Architecture for Semantic Pixel-Wise Labelling

Caffe SegNet This is a modified version of Caffe which supports the SegNet architecture As described in SegNet: A Deep Convolutional Encoder-Decoder A

1.1k Jan 02, 2023

Patch Rotation: A Self-Supervised Auxiliary Task for Robustness and Accuracy of Supervised Models

Patch-Rotation(PatchRot) Patch Rotation: A Self-Supervised Auxiliary Task for Robustness and Accuracy of Supervised Models Submitted to Neurips2021 To

4 Jul 12, 2021

Source Code of NeurIPS21 paper: Recognizing Vector Graphics without Rasterization

YOLaT-VectorGraphicsRecognition This repository is the official PyTorch implementation of our NeurIPS-2021 paper: Recognizing Vector Graphics without

49 Dec 20, 2022

Paaster is a secure by default end-to-end encrypted pastebin built with the objective of simplicity.

Follow the development of our desktop client here Paaster Paaster is a secure by default end-to-end encrypted pastebin built with the objective of sim

211 Dec 25, 2022
