Learning to Execute (L2E)

Official code base for reproducing all results reported in

I. Schubert, D. Driess, O. Oguz, and M. Toussaint: Learning to Execute: Efficient Learning of Universal Plan-Conditioned Policies in Robotics. NeurIPS (2021)

Installation

Initialize submodules:

git submodule init
git submodule update

Install rai-python

For rai-python, it is recommended to use this docker image.

If you want to install rai-python manually, follow the instructions here. You will also need to install PhysX, ideally following these instructions.
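To check that the rai-python bindings built correctly, you can try importing them from Python. Both the path and the module name libry below are assumptions based on a typical rai-python layout, not something this repository specifies; adjust them to your installation.

# Hedged sanity check for the rai-python bindings. The path and the
# module name 'libry' are assumptions; adjust to your installation.
import sys
sys.path.append('/path/to/rai-python/rai/rai/ry')  # placeholder path
import libry as ry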

Install gym-physx

Modify the path to rai-python/rai/rai/ry in gym-physx/gym_physx/envs/physx_pushing_env.py so that it matches your installation. Then install gym-physx using pip:

cd gym-physx
pip install .
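After installation, the pushing environment should be available through the standard gym interface. The environment ID below is a placeholder assumption; look up the actual registered name in gym_physx/gym_physx/envs/.

# Minimal usage sketch. The ID 'physx-pushing-v0' is a placeholder,
# not confirmed by this repository; check gym_physx for the real name.
import gym
import gym_physx  # importing the package registers its environments

env = gym.make('physx-pushing-v0')  # placeholder ID
obs = env.reset()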

Install gym-obstacles

If you also want to run the 2D maze example with moving obstacles introduced in Section A.3 of the paper, install gym-obstacles:

cd gym-obstacles
pip install .

Install our fork of stable-baselines3

cd stable-baselines3
pip install .
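Since this fork shadows any stock stable-baselines3 in your environment, a quick import check can confirm that all packages resolve to the locally installed copies:

# Verify that the installed packages import cleanly.
import gym_physx
import gym_obstacles       # skip if you did not install the 2D maze example
import stable_baselines3
print(stable_baselines3.__file__)  # should point to the copy installed from this fork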

Reproduce figures

l2e/l2e/ contains code to reproduce the results in the paper.

Figures consist of multiple experiments and are defined in plot_results.json.

Experiments are defined in config_$EXPERIMENT.json.

Intermediate and final results are saved to $scratch_root/$EXPERIMENT/ (configure $scratch_root in each config_$EXPERIMENT.json as well as in plot_results.json).
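As an illustration of how these pieces connect, the snippet below loads an experiment configuration. Apart from the variables named in this README ($scratch_root, and later $AGENTS_MIN and $AGENTS_MAX), the key names are assumptions about the schema, not its actual definition.

# Sketch only: key names are assumed, not the repository's actual schema.
import json

with open('config_l2e.json') as f:  # placeholder experiment config
    config = json.load(f)

# Intermediate and final results go to $scratch_root/$EXPERIMENT/
print(config['scratch_root'])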

Step-by-step instructions to reproduce figures:

  1. Depending on the experiment, use the following training scripts:

    1. For the RL runs ($EXPERIMENT=l2e* and $EXPERIMENT=her*)

      ./train.sh $EXPERIMENT
    2. For the Inverse Model runs ($EXPERIMENT=im_plan_basic and $EXPERIMENT=im_plan_obstacle_training)

      First collect data:

      ./imitation_data.sh $EXPERIMENT

      Then train the inverse model:

      ./imitation_learning.sh $EXPERIMENT
    3. For the Direct Execution runs ($EXPERIMENT=plan_basic and $EXPERIMENT=plan_obstacle)

      No training stage is needed here.

    ./train.sh $EXPERIMENT will launch multiple screens with multiple independent runs of $EXPERIMENT. The number of runs is configured using $AGENTS_MIN and $AGENTS_MAX in config_$EXPERIMENT.json.

    ./imitation_data.sh will launch $n_data_collect_workers workers that collect data, and ./imitation_learning.sh will launch $n_training_workers independent training runs.

  2. Evaluate results

    ./evaluate.sh $EXPERIMENT

    python evaluate.py $EXPERIMENT will launch multiple screens, one for each agent that was trained in step 1. It automatically scans for new training output and only evaluates model checkpoints that haven't been evaluated yet (see the sketch after these steps).

  3. Plot results

    After all experiments are finished, create plots using

    python plot_results.py

    This creates all data figures contained in the paper. Figures are saved to l2e/figs/ (configurable in plot_results.json).
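For intuition, the incremental evaluation in step 2 boils down to scanning each experiment's output directory and skipping checkpoints that already have results. The sketch below illustrates that idea; the directory layout, file extensions, and marker files are assumptions, not the repository's actual logic.

# Illustrative sketch of incremental checkpoint evaluation (step 2).
# Layout and file names are assumptions, not this repository's code.
from pathlib import Path

def unevaluated_checkpoints(experiment_dir):
    # Yield checkpoints that do not yet have a matching results file.
    for ckpt in sorted(Path(experiment_dir).glob('**/*.zip')):
        result = ckpt.with_suffix('.result.json')  # hypothetical marker
        if not result.exists():
            yield ckpt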
