Model-based reinforcement learning in TensorFlow

Last update: Nov 09, 2022

Overview

Bellman

What does Bellman do?

Bellman is a package for model-based reinforcement learning (MBRL) in Python, using TensorFlow and building on top of the model-free reinforcement learning package TensorFlow Agents.

Bellman provides a framework for flexible composition of model-based reinforcement learning algorithms. It offers two major classes of algorithms: decision-time planning and background planning. Within each class, any kind of supervised learning method can be used to learn a component of the environment. Bellman was designed with modularity in mind: important components can be flexibly combined, such as the type of decision-time planning method (e.g. the cross-entropy method or random shooting) and the type of state transition model (e.g. a probabilistic neural network or an ensemble of neural networks). Bellman also provides implementations of several popular state-of-the-art MBRL algorithms, such as PETS, MBPO and METRPO. The online documentation (latest) contains more details.
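To make the two decision-time planning methods named above concrete, here is a minimal, library-free sketch. It does not use Bellman's API: the toy transition and reward functions, and every function name and parameter below, are assumptions made purely for illustration.

```python
import random
import statistics


def toy_transition(state, action):
    """Hypothetical stand-in for a learned transition model (1-D point mass)."""
    return state + 0.1 * action


def toy_reward(state, action):
    """Hypothetical reward: stay near the origin and use little control."""
    return -(state ** 2) - 0.01 * action ** 2


def simulated_return(state, actions):
    """Sum of rewards obtained by playing an action sequence in the model."""
    total = 0.0
    for action in actions:
        total += toy_reward(state, action)
        state = toy_transition(state, action)
    return total


def random_shooting(state, rng, horizon=10, num_candidates=200):
    """Sample sequences uniformly; return the first action of the best one."""
    candidates = [[rng.uniform(-1.0, 1.0) for _ in range(horizon)]
                  for _ in range(num_candidates)]
    best = max(candidates, key=lambda seq: simulated_return(state, seq))
    return best[0]


def cross_entropy_method(state, rng, horizon=10, num_candidates=200,
                         num_elites=20, num_iterations=5):
    """Repeatedly refit a diagonal Gaussian over sequences to the elites."""
    means, stds = [0.0] * horizon, [1.0] * horizon
    for _ in range(num_iterations):
        candidates = [
            [min(1.0, max(-1.0, rng.gauss(m, s))) for m, s in zip(means, stds)]
            for _ in range(num_candidates)
        ]
        candidates.sort(key=lambda seq: simulated_return(state, seq), reverse=True)
        columns = list(zip(*candidates[:num_elites]))
        means = [statistics.mean(col) for col in columns]
        stds = [statistics.pstdev(col) + 1e-6 for col in columns]
    return means[0]


rng = random.Random(0)
print("random shooting:", random_shooting(2.0, rng))
print("cross-entropy method:", cross_entropy_method(2.0, rng))
```

Both planners only query the model through `simulated_return`, which is why the planner and the transition model can be swapped independently of each other.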

Bellman requires Python 3.7 onwards and uses TensorFlow 2.4+ for running computations, which allows fast execution on GPUs.
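A quick way to check these minimum versions before installing is sketched below. The helper names are hypothetical, and the lookup assumes TensorFlow was installed under the distribution name tensorflow (variants such as CPU-only builds may use a different name).

```python
import re
import sys
from importlib import metadata


def version_tuple(version):
    """Return the leading numeric components of a version string as ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"Cannot parse version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version, minimum):
    """True if `version` is at least `minimum`, e.g. (2, 4) for TensorFlow 2.4+."""
    return version_tuple(version)[: len(minimum)] >= tuple(minimum)


print("Python 3.7+:", sys.version_info[:2] >= (3, 7))
try:
    # Assumption: the distribution is named "tensorflow".
    print("TensorFlow 2.4+:", meets_minimum(metadata.version("tensorflow"), (2, 4)))
except metadata.PackageNotFoundError:
    print("TensorFlow is not installed")
```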

Maintainers

Bellman was originally created by (in alphabetical order) Vincent Adam, Jordi Grau-Moya, Felix Leibfried, John A. McLeod, Hrvoje Stojic, and Peter Vrancx, at Secondmind Labs.

It is now actively maintained by (in alphabetical order) Felix Leibfried, John A. McLeod, Hrvoje Stojic, and Peter Vrancx.

Bellman is an open source project. If you have relevant skills and are interested in contributing then please do contact us (see "The Bellman Community" section below).

We are very grateful to our Secondmind Labs colleagues, maintainers of GPflow and Trieste in particular, for their help with creating contributing guidelines, instructions for users and open-sourcing in general.

Install Bellman

For users

To install the latest (stable) release from PyPI, use pip:

$ pip install bellman

To install the toolbox from the latest source on GitHub, check out the develop branch of the Bellman GitHub repository and, in the repository root, run

$ pip install -e .

This will install the toolbox in editable mode.

For contributors

If you wish to contribute, please use Poetry to manage dependencies in a local virtual environment. The Poetry configuration file specifies all the development dependencies (testing, linting, typing, docs, etc.) and makes it much easier to contribute. To install Poetry, follow the instructions in the Poetry documentation.

To install this project in editable mode, run the commands below from the root directory of the bellman repository.

poetry install

This command creates a virtual environment for this project in a hidden .venv directory under the root directory. You can easily activate it with

poetry shell

You must also run the poetry install command to install updated dependencies when the pyproject.toml file is updated, for example after a git pull.

Installing MuJoCo (Optional)

Many continuous control benchmarks in MBRL use the MuJoCo physics engine. Some of the TF-Agents examples have been tested against MuJoCo environments as well. MuJoCo is proprietary software that requires a license (see the MuJoCo website). As a result, installing it is optional, but because of its importance to the research community it is highly recommended. Don't worry if you decide not to install MuJoCo, though: all our examples and notebooks rely on standard environments available in OpenAI Gym.

We interface with MuJoCo through the Python library mujoco-py via OpenAI Gym (see the mujoco-py GitHub page). Check the installation instructions there on how to install MuJoCo. Note that you should install MuJoCo 1.5, since OpenAI Gym supports that version. After that, you can install the mujoco-py library with an additional Poetry command:

poetry install -E mujoco-py

If this command fails, please check the troubleshooting sections on the mujoco-py GitHub page. You might need to satisfy other mujoco-py dependencies (e.g. Linux system libraries) or set some environment variables.

The Bellman Community

Getting help

Bugs, feature requests, pain points, annoying design quirks, etc: Please use GitHub issues to flag up bugs/issues/pain points, suggest new features, and discuss anything else related to the use of Bellman that in some sense involves changing the Bellman code itself. We positively welcome comments or concerns about usability, and suggestions for changes at any level of design. We aim to respond to issues promptly, but if you believe we may have forgotten about an issue, please feel free to add another comment to remind us.

"How-to-use" questions: Please use Stack Overflow (Bellman tag) to ask questions that relate to "how to use Bellman", i.e. questions of understanding rather than issues that require changing Bellman code. (If you are unsure where to ask, you are always welcome to open a GitHub issue; we may then ask you to move your question to Stack Overflow.)

Slack workspace

We have a public Bellman Slack workspace. Please use this invite link if you'd like to join, whether to ask short informal questions or to be involved in the discussion and future development of Bellman.

Contributing

All constructive input is very much welcome. For detailed information, see the guidelines for contributors.

Citing Bellman

To cite Bellman, please reference our arXiv paper, where we review the framework and describe the design. A sample BibTeX entry is given below:

@article{bellman2021,
    author = {McLeod, John and Stojic, Hrvoje and Adam, Vincent and Kim, Dongho and Grau-Moya, Jordi and Vrancx, Peter and Leibfried, Felix},
    title = {Bellman: A Toolbox for Model-based Reinforcement Learning in TensorFlow},
    year = {2021},
    journal = {arXiv:2103.14407},
    url = {https://arxiv.org/abs/2103.14407}
}

License

Apache License 2.0

Comments

Dongho/tensorflow 2.5
PR type: bugfix / enhancement / new feature / doc improvement

Related issue(s)/PRs:

Summary

Proposed changes

Quick fix setup.py to version up tensorflow and other related packages

What alternatives have you considered?

Minimal working example

PR checklist

[ ] New features: code is well-documented

[ ] detailed docstrings (API documentation)

[ ] notebook examples (usage demonstration)

[ ] The bug case / new feature is covered by unit tests

[ ] Code has type annotations

[ ] I ran the black+isort formatter

[ ] I locally tested that the tests pass

Release notes

Fully backwards compatible: yes

If not, why is it worth breaking backwards compatibility:

Commit message (for release notes):

Quick fix for setup.py
opened by dongho-kim 1
setting things up for pypi
2 things I would need some help with:

pyproject.toml - [build-system] currently points to poetry, is that fine for building a package for pip?

I'm not convinced we need all the libraries listed in install_requires in setup.py - @johnamcleod you were taking care of dependencies before, can you give a hand here please?

I have set up a workflow for pushing things to PyPI automatically, but I am not sure how to test it (perhaps I could modify it to use Test PyPI). I will first push things to Test PyPI to verify that things work as intended.
enhancement
opened by hstojic 1
Dongho/tensorflow 2.5
PR type: enhancement

Related issue(s)/PRs:

Summary

Proposed changes

Support tensorflow 2.5, tf-agents 0.8.0 and tensorflow-probability 0.12.2

Fixes for test errors that may occur in macOS environments (including Apple Silicon)

What alternatives have you considered?

Minimal working example

NA as no new features added

PR checklist

[ ] New features: code is well-documented

[ ] detailed docstrings (API documentation)

[ ] notebook examples (usage demonstration)

[ ] The bug case / new feature is covered by unit tests

[ ] Code has type annotations

[ ] I ran the black+isort formatter

[X] I locally tested that the tests pass

Release notes

Fully backwards compatible: no

If not, why is it worth breaking backwards compatibility:

Changes in TFAgent.__init__ introduced in later tf-agents versions seem to break backwards compatibility, causing errors when we pass TRAIN_ARGSPEC. However, breaking compatibility is worthwhile because of the security vulnerability in tensorflow 2.4.0.

Commit message (for release notes):

Support tensorflow 2.5, tf-agents 0.8.0 and tensorflow-probability 0.12.2

enhancement good first issue
opened by dongho-kim 0
Add MBPO train_eval function
PR type: enhancement

Related issue(s)/PRs: fix #24

Summary

Proposed changes

The MBPO agent does not have a train_eval function in the benchmark package. This PR fixes that.

What alternatives have you considered?

Minimal working example

Look at the run_mbpo example.

Release notes

Fully backwards compatible: yes

If not, why is it worth breaking backwards compatibility:

Commit message (for release notes):

Add a train_eval function for the MBPO agent.

enhancement
opened by johnamcleod 0
John/fix none loss in harness
PR type: bugfix

Related issue(s)/PRs: N/A

Summary

Proposed changes

There is an integration issue between the TFTrainingScheduler and the ExperimentHarness: if the call to the agent trainer's train_step method returns None for the loss, the harness throws an exception when trying to write the logs. This situation can occur when too few environment steps have passed to train a model-free agent component of a model-based agent.

This PR addresses the issue by intercepting the None loss from the agent trainer in the scheduler and not adding it to the training_info dictionary.
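The idea of intercepting a None loss can be sketched as follows. This is not the actual TFTrainingScheduler code: the function name, the component names, and the shape of the dictionary are assumptions for illustration only.

```python
from typing import Callable, Dict, Optional


def collect_training_info(
    train_steps: Dict[str, Callable[[], Optional[float]]],
) -> Dict[str, float]:
    """Run each component's train step and keep only the losses that exist."""
    training_info = {}
    for component, train_step in train_steps.items():
        loss = train_step()
        if loss is None:
            # The component skipped training, e.g. too few environment steps.
            continue
        training_info[component] = loss
    return training_info


# Hypothetical components: the model trains, the model-free agent does not yet.
info = collect_training_info({
    "transition_model": lambda: 0.42,
    "model_free_agent": lambda: None,
})
print(info)  # {'transition_model': 0.42}
```

Filtering at the point where the dictionary is built means the code that writes the logs never sees a None value.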

Minimal working example

The run_mbpo example hits this problem on the first environment time step.

PR checklist

[ ] New features: code is well-documented

[ ] detailed docstrings (API documentation)

[ ] notebook examples (usage demonstration)

[x] The bug case / new feature is covered by unit tests

[x] Code has type annotations

[x] I ran the black+isort formatter

[x] I locally tested that the tests pass

Release notes

Fully backwards compatible: yes

If not, why is it worth breaking backwards compatibility:

Commit message (for release notes):

...

bug
opened by johnamcleod 0
upload-pypi.yaml fails on `main`

The GitHub action fails at the "Verify git tag vs. VERSION" step. The $GITHUB_REF environment variable seems to come with refs/tags/ prepended, which the code does not allow for. Here is a solution: https://github.community/t/how-to-get-just-the-tag-name/16241
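One way to handle the prefix is sketched below in Python. The function names are hypothetical, and the assumption that the tag carries a leading "v" while the VERSION file does not is a guess based on the v0.1.0 release tag, not something confirmed by the workflow itself.

```python
def tag_from_ref(github_ref):
    """Strip the refs/tags/ prefix that GitHub Actions puts on tag refs."""
    prefix = "refs/tags/"
    if not github_ref.startswith(prefix):
        raise ValueError(f"Not a tag ref: {github_ref!r}")
    return github_ref[len(prefix):]


def tag_matches_version(github_ref, version):
    """Compare the tag with the package version, tolerating a leading 'v'."""
    tag = tag_from_ref(github_ref)
    if tag.startswith("v"):
        tag = tag[1:]
    # Strip whitespace because a VERSION file usually ends with a newline.
    return tag == version.strip()


print(tag_from_ref("refs/tags/v0.1.0"))
print(tag_matches_version("refs/tags/v0.1.0", "0.1.0\n"))
```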
bug

opened by hstojic 0
Release/0.1.0

Updated develop with a few small corrections for merging into main as a (pre-)release 0.1.0. It seems we can then create a release out of that version of main on GitHub with a description of the changelog. That should create a tag.
release

opened by hstojic 0
Hstojic/trigger docs
Modified a GitHub action to trigger documentation generation in the website repo instead. The action sends an event that an action in the website repo listens for. Tested, and it seems to work; check https://belman.dev/docs

see:

https://docs.github.com/en/actions/reference/events-that-trigger-workflows#external-events-repository_dispatch

https://docs.github.com/en/rest/reference/repos#create-a-repository-dispatch-event

https://docs.github.com/en/developers/webhooks-and-events/webhook-events-and-payloads#repository_dispatch
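For readers unfamiliar with repository_dispatch, the request that triggers such an event can be sketched as below. The endpoint and body fields follow the GitHub REST API linked above; the owner, repository, event type, and token values are placeholders, not the ones used by the Bellman workflow.

```python
import json
import urllib.request


def build_dispatch_request(owner, repo, event_type, token, client_payload=None):
    """Build (but do not send) a 'create a repository dispatch event' request."""
    url = f"https://api.github.com/repos/{owner}/{repo}/dispatches"
    body = {"event_type": event_type, "client_payload": client_payload or {}}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


# Placeholder values; a real call needs a token with access to the target repo.
request = build_dispatch_request("example-org", "website", "build-docs", token="<token>")
print(request.get_method(), request.full_url)
# Sending is left to the caller, e.g. urllib.request.urlopen(request)
```

A workflow in the receiving repository would then list repository_dispatch among its triggers and react to the chosen event type.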

documentation enhancement
opened by hstojic 0
Felix/initial commit
opened by fleibfried 0
poetry task check_requirements

Feature request

Contrary to the description in CONTRIBUTING.md, we cannot run poetry run task check_requirements, as the task does not seem to be defined anywhere. It would be great to add this feature back.

Motivation

Is your feature request related to a problem?

It is unclear how to automatically update setup.py when we update poetry.

Proposal

Describe the solution you would like

What alternatives have you considered?

Are you willing to open a pull request? (We really appreciate contributions!)

Additional context
enhancement

opened by dongho-kim 0

Releases(v0.1.0)

v0.1.0(Apr 7, 2021)
First release, 0.1.0

(well, a pre-release actually :)

What is Bellman?

Bellman is a package for model-based reinforcement learning (MBRL) in Python, using TensorFlow 2.4+ and building on top of the model-free reinforcement learning package TensorFlow Agents.

Main features

A framework for flexible composition of model-based reinforcement learning algorithms.

It offers modular components for composing two major classes of algorithms:

decision time planning

background planning

Keras neural networks for modeling transition dynamics

Rewards, termination and initial state distributions are assumed to be known for now

Implementations of several state-of-the-art model-based algorithms (PETS, MBPO and METRPO) and one model-free algorithm (TRPO)
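As a rough illustration of background planning with a learned transition model and known reward and termination functions, the sketch below generates virtual transitions that a model-free agent could train on. It does not use Bellman's API; all functions, names, and the toy dynamics are assumptions.

```python
import random


def learned_transition(state, action):
    """Hypothetical stand-in for a learned dynamics model (1-D point mass)."""
    return state + 0.1 * action


def known_reward(state, action, next_state):
    """Assumed known reward: negative distance from the origin."""
    return -abs(next_state)


def known_termination(state):
    """Assumed known termination rule: the episode ends outside [-5, 5]."""
    return abs(state) > 5.0


def generate_virtual_transitions(start_states, policy, rollout_length):
    """Roll the policy out in the model, collecting (s, a, r, s', done) tuples."""
    transitions = []
    for state in start_states:
        for _ in range(rollout_length):
            action = policy(state)
            next_state = learned_transition(state, action)
            done = known_termination(next_state)
            reward = known_reward(state, action, next_state)
            transitions.append((state, action, reward, next_state, done))
            if done:
                break
            state = next_state
    return transitions


rng = random.Random(0)


def random_policy(state):
    """Placeholder policy; a real agent's policy would go here."""
    return rng.uniform(-1.0, 1.0)


virtual_buffer = generate_virtual_transitions([0.0, 1.0, -2.0], random_policy, 3)
print(len(virtual_buffer))  # 9
```

In contrast to decision-time planning, the model here is used only to produce training data, and the resulting policy is queried directly when acting.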

Owner

GitHub Repository https://bellman.dev
