Simulation environments for the CrazyFlie quadrotor: Used for Reinforcement Learning and Sim-to-Real Transfer

Last update: Dec 07, 2022

Overview

Phoenix-Drone-Simulation

An OpenAI Gym environment based on PyBullet for learning to control the CrazyFlie quadrotor:

Can be used for Reinforcement Learning (check out the examples!) or Model Predictive Control
We used this repository for sim-to-real transfer experiments (see publication [1] below)
The implemented dynamics model is based on Bitcraze's Crazyflie 2.1 nano-quadrotor

[Figure panels: Circle Task, TakeOff]

The following tasks are currently available to fly the little drone:

Hover
Circle
Take-off (implemented but not yet working properly: reward function must be tuned!)
~~Reach~~ (not yet implemented)

Overview of Environments

Environment	Task	Controller	Physics	Observation Frequency	Domain Randomization	Aerodynamic effects	Motor Dynamics
`DroneHoverSimpleEnv-v0`	Hover	PWM (100Hz)	Simple	100 Hz	10%	None	Instant force
`DroneHoverBulletEnv-v0`	Hover	PWM (100Hz)	PyBullet	100 Hz	10%	None	First-order
`DroneCircleSimpleEnv-v0`	Circle	PWM (100Hz)	Simple	100 Hz	10%	None	Instant force
`DroneCircleBulletEnv-v0`	Circle	PWM (100Hz)	PyBullet	100 Hz	10%	None	First-order
`DroneTakeOffSimpleEnv-v0`	Take-off	PWM (100Hz)	Simple	100 Hz	10%	Ground-effect	Instant force
`DroneTakeOffBulletEnv-v0`	Take-off	PWM (100Hz)	PyBullet	100 Hz	10%	Ground-effect	First-order
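
The "Domain Randomization" column lists 10% for every environment. As a rough illustration of what such a setting can mean, the sketch below scales a set of nominal physical parameters by up to ±10% at the start of an episode. The uniform distribution, the parameter names, and the nominal values are assumptions made for this example; they are not taken from the repository.

```python
import random

# Hypothetical nominal parameters; names and values are illustrative only.
NOMINAL_PARAMS = {
    "mass_kg": 0.03,
    "thrust_to_weight": 2.0,
    "motor_time_constant_s": 0.1,
}


def randomize_params(nominal, fraction=0.10, rng=None):
    """Return a copy of `nominal` with each value scaled by a factor
    drawn uniformly from [1 - fraction, 1 + fraction]."""
    rng = rng or random.Random()
    return {
        name: value * rng.uniform(1.0 - fraction, 1.0 + fraction)
        for name, value in nominal.items()
    }


# Draw a fresh parameter set, as one might do at every env.reset().
episode_params = randomize_params(NOMINAL_PARAMS, rng=random.Random(0))
```

Passing a seeded `random.Random` instance keeps the perturbations reproducible across runs.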

Installation and Requirements

Follow these few steps to get the repository ready to run: clone the repository and install the phoenix-drone-simulation package via pip. Everything after a $ is entered in a terminal, while everything after >>> is passed to a Python interpreter. Use the following three commands for installation:

$ git clone https://github.com/SvenGronauer/phoenix-drone-simulation
$ cd phoenix-drone-simulation/
$ pip install -e .

This package follows OpenAI's Gym Interface.

Note: if your default Python is 2.7, replace pip with pip3 and python with python3 in all commands.

Supported Systems

We tested this package under Ubuntu 20.04 and Mac OS X 11.2 running Python 3.7 and 3.8. Other systems might work as well but have not been tested yet. Note that PyBullet supports Windows only experimentally.

Dependencies

phoenix-drone-simulation heavily depends on two packages:

Gym
PyBullet

Getting Started

After successful installation of the repository, the environments can be instantiated via gym.make:

>>> import gym
>>> import phoenix_drone_simulation
>>> env = gym.make('DroneHoverBulletEnv-v0')

The functional interface follows the API of OpenAI Gym (Brockman et al., 2016), which consists of the following three important functions:

>>> observation = env.reset()
>>> random_action = env.action_space.sample()  # usually the action is determined by a policy
>>> next_observation, reward, done, info = env.step(random_action)

Minimal code for visualizing a uniformly random policy in a GUI:

import gym
import time
import phoenix_drone_simulation

env = gym.make('DroneHoverBulletEnv-v0')

while True:
    done = False
    env.render()  # make GUI of PyBullet appear
    x = env.reset()
    while not done:
        random_action = env.action_space.sample()
        x, reward, done, info = env.step(random_action)
        time.sleep(0.05)

Note that the GUI appears only if the render function is called before the reset function.

Training Policies

To train an agent with the PPO algorithm call:

$ python -m phoenix_drone_simulation.train --alg ppo --env DroneHoverBulletEnv-v0

This works with essentially every environment that is compatible with the OpenAI Gym interface:

$ python -m phoenix_drone_simulation.train --alg ppo --env CartPole-v0

After an RL model has been trained and its checkpoint has been saved on your disk, you can visualize the checkpoint:

$ python -m phoenix_drone_simulation.play --ckpt PATH_TO_CKPT

where PATH_TO_CKPT is the path to the checkpoint, e.g. /var/tmp/sven/DroneHoverSimpleEnv-v0/trpo/2021-11-16__16-08-09/seed_51544

Examples

`generate_trajectories.py`

See the generate_trajectories.py script, which shows how to generate data batches of size N. Use generate_trajectories.py --play to visualize the policy in the PyBullet simulator.
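
As a rough idea of what collecting a batch of size N involves, here is a minimal sketch. It is not the code of generate_trajectories.py; the function name `collect_batch` and the tuple layout are assumptions for illustration. It relies only on the classic Gym interface shown in Getting Started, where step returns (observation, reward, done, info).

```python
def collect_batch(env, policy, batch_size):
    """Collect `batch_size` transitions, resetting the environment
    whenever an episode ends.

    `policy` is any callable that maps an observation to an action.
    Each transition is stored as (obs, action, reward, next_obs, done).
    """
    batch = []
    obs = env.reset()
    while len(batch) < batch_size:
        action = policy(obs)
        next_obs, reward, done, info = env.step(action)
        batch.append((obs, action, reward, next_obs, done))
        # Start a new episode once the current one has terminated.
        obs = env.reset() if done else next_obs
    return batch
```

With the package installed, calling `collect_batch(env, lambda obs: env.action_space.sample(), 1000)` on an environment created via `gym.make('DroneHoverBulletEnv-v0')` would gather 1000 transitions from a random policy.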

`train_drone_hover.py`

Use Reinforcement Learning (RL) to train the drone to hold its position at (0, 0, 1). This canonical example relies on the RL-safety-Algorithms repository, a framework for training RL algorithms in parallel.

`transfer_learning_drone_hover.py`

Shows a transfer learning approach. We first train a PPO model in the source domain DroneHoverSimpleEnv-v0 and then re-train the model on the more complex target domain DroneHoverBulletEnv-v0. Note that the DroneHoverBulletEnv-v0 environment builds upon an accurate motor model of the CrazyFlie drone that includes a motor dead time as well as a motor lag.
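
A motor dead time followed by a motor lag can be sketched as a delayed first-order low-pass filter. This is a generic textbook model, not the repository's implementation; the class name, the time constant, and the dead time below are assumed values chosen for illustration. The step size of 0.01 s matches the 100 Hz rate listed in the environment table.

```python
import math
from collections import deque


class DelayedFirstOrderMotor:
    """First-order lag preceded by a pure dead time (illustrative values)."""

    def __init__(self, time_constant=0.08, dead_time=0.02, dt=0.01):
        # Exact discretization of dx/dt = (u - x) / time_constant.
        self.alpha = 1.0 - math.exp(-dt / time_constant)
        # Number of whole time steps a command is held back.
        delay_steps = round(dead_time / dt)
        self.buffer = deque([0.0] * delay_steps)
        self.state = 0.0

    def step(self, command):
        """Advance one time step and return the current motor output."""
        self.buffer.append(command)
        delayed = self.buffer.popleft()
        self.state += self.alpha * (delayed - self.state)
        return self.state


motor = DelayedFirstOrderMotor()
outputs = [motor.step(1.0) for _ in range(100)]  # unit step response
```

With these values the output stays at zero for the first two steps (the dead time) and then rises smoothly toward the commanded value, in contrast to the "Instant force" behaviour of the Simple environments.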

Tools

convert.py @ Sven Gronauer

A function used by Sven to extract the policy networks from his trained Actor-Critic module and convert the model to the JSON file format.
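
To give an idea of what such a conversion can look like, the sketch below serializes a small fully connected network to JSON and evaluates the restored copy. The JSON schema, the function names, and the weights are hypothetical; the actual format written by convert.py is not described here and may differ.

```python
import json


def policy_to_json(layers):
    """Serialize a list of dense layers to a JSON string.

    Each layer is a dict with 'weights' (one row per output unit),
    'biases' (list) and 'activation' (str). This schema is hypothetical
    and chosen only for this example.
    """
    return json.dumps({"layers": layers}, indent=2)


def forward(layers, inputs):
    """Evaluate the network described by `layers` on a list of floats."""
    activations = {
        "identity": lambda v: v,
        "relu": lambda v: max(0.0, v),
    }
    x = inputs
    for layer in layers:
        act = activations[layer["activation"]]
        x = [
            act(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(layer["weights"], layer["biases"])
        ]
    return x


# A tiny 2-2-1 network with made-up weights.
layers = [
    {"weights": [[1.0, -1.0], [0.5, 0.5]], "biases": [0.0, 0.1], "activation": "relu"},
    {"weights": [[1.0, 2.0]], "biases": [-0.5], "activation": "identity"},
]
restored = json.loads(policy_to_json(layers))["layers"]
```

Storing plain lists of numbers keeps the file readable from any language, which is convenient when the policy has to be loaded outside of Python.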

Version History and Changes

Version	Changes	Date
v1.0	Public Release: Simulation parameters as proposed in Publication [1]	19.04.2022
v0.2	Add: accurate motor dynamic model and first real-world transfer insights	21.09.2021
v0.1	Re-factor: of repository (only Hover task yet implemented)	18.05.2021
v0.0	Fork: from Gym-PyBullet-Drones Repo	01.12.2020

Publications

[1] Using Simulation Optimization to Improve Zero-shot Policy Transfer of Quadrotors

Sven Gronauer, Matthias Kissel, Luca Sacchetto, Mathias Korte, Klaus Diepold

https://arxiv.org/abs/2201.01369

Lastly, we want to thank:

Jacopo Panerati and his team for contributing the Gym-PyBullet-Drones Repo, which was the starting point for this repository.
Artem Molchanov and collaborators for their hints about the CrazyFlie Firmware and the motor dynamics in their paper "Sim-to-(Multi)-Real: Transfer of Low-Level Robust Control Policies to Multiple Quadrotors"
Jakob Foerster for his Bachelor Thesis and his insights about the CrazyFlie's parameter values

This repository has been developed at the

Chair of Data Processing
TUM School of Computation, Information and Technology
Technical University of Munich

