Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models Benchmark and Efficient Evaluation

This repository hosts the code related to the paper:

Marco Rosano, Antonino Furnari, Luigi Gulino, Corrado Santoro and Giovanni Maria Farinella, "Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models Benchmark and Efficient Evaluation". Submitted to "Robotics and Autonomous Systems" (RAS), 2022.

For more details please see the project web page at https://iplab.dmi.unict.it/EmbodiedVN.

Overview

This code is built on top of the Habitat-api/Habitat-lab project. Please see the Habitat project page for more details.

This repository provides the following components:

  1. The implementation of the proposed tool, integrated with Habitat, to train visual navigation models on synthetic observations and test them on realistic episodes containing real-world images. This allows the estimation of real-world performance, avoiding the physical deployment of the robotic agent;

  2. The official PyTorch implementation of the proposed visual navigation models, which follow different strategies to combine a range of visual mid-level representations;

  3. The synthetic 3D model of the proposed environment, acquired using the Matterport 3D scanner and used to perform the navigation episodes at train and test time;

  4. The photorealistic 3D model that contains real-world images of the proposed environment, labeled with their pose (X, Z, Angle). The sparse 3D reconstruction was performed using the COLMAP Structure from Motion tool and then aligned with the Matterport virtual 3D map;

  5. An integration with CycleGAN to train and evaluate navigation models with Habitat on sim2real-adapted images;

  6. The checkpoints of the best performing navigation models.

Installation

Requirements

  • Python >= 3.7; version 3.7 is recommended to avoid possible issues.
  • Other requirements will be installed via pip in the following steps.

Steps

  1. (Optional) Create an Anaconda environment and install everything in it (conda create -n fusion-habitat python=3.7; conda activate fusion-habitat)

  2. Install the Habitat simulator following the official repo instructions. Development and testing were done on commit bfbe9fc30a4e0751082824257d7200ad543e4c0e, installing the simulator "from source" and launching the ./build.sh --headless --with-cuda command (guide); a sketch of this installation is shown after these steps. Please consider following these suggestions if you encounter issues while installing the simulator.

  3. Install the customized Habitat-lab (this repo):

    git clone https://github.com/rosanom/mid-level-fusion-nav.git
    cd mid-level-fusion-nav/
    pip install -r requirements.txt
    python setup.py develop --all # install habitat and habitat_baselines
    
  4. Download our dataset (journal version) from here, and extract it to the repository folder (mid-level-fusion-nav/). Inside the data folder you should see this structure:

    datasets/pointnav/orangedev/v1/...
    real_images/orangedev/...
    scene_datasets/orangedev/...
    orangedev_checkpoints/...
    
  5. (Optional, to check if the software works properly) Download the test scenes data and extract the zip file to the repository folder (mid-level-fusion-nav/). To verify that the tool was successfully installed, run python examples/benchmark.py or python examples/example.py.
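
For reference, here is a minimal sketch of the "from source" habitat-sim installation mentioned in step 2, assuming a CUDA-capable machine and the conda environment from step 1 (the pip install of habitat-sim's own requirements is an assumed step; follow the official habitat-sim guide if anything differs on your system):

    # clone habitat-sim and check out the commit used for development and testing
    git clone https://github.com/facebookresearch/habitat-sim.git
    cd habitat-sim
    git checkout bfbe9fc30a4e0751082824257d7200ad543e4c0e
    # install habitat-sim's python dependencies in the active conda environment (assumed step)
    pip install -r requirements.txt
    # build the simulator from source, headless and with CUDA support (as in step 2)
    ./build.sh --headless --with-cuda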

Data Structure

All data can be found inside the mid-level-fusion-nav/data/ folder:

  • the datasets/pointnav/orangedev/v1/... folder contains the generated train and validation navigation episode files;
  • the real_images/orangedev/... folder contains the real-world images of the proposed environment and the CSV file with their pose information (obtained with COLMAP);
  • the scene_datasets/orangedev/... folder contains the 3D mesh of the proposed environment;
  • orangedev_checkpoints/ is the folder where checkpoints are saved during training. Place a checkpoint file here if you want to resume training or evaluate the model; the system will load the most recent checkpoint file.
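
A quick, hypothetical way to verify that the data was extracted to the expected locations is to list these folders from the repository root (this check is not part of the repository):

    # run from mid-level-fusion-nav/
    ls data/datasets/pointnav/orangedev/v1/
    ls data/real_images/orangedev/
    ls data/scene_datasets/orangedev/
    ls data/orangedev_checkpoints/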

Config Files

There are two configuration files:

habitat_domain_adaptation/configs/tasks/pointnav_orangedev.yaml

and

habitat_domain_adaptation/habitat_baselines/config/pointnav/ddppo_pointnav_orangedev.yaml.

In the first file you can change the robot's properties, the sensors used by the agent, and the dataset used in the experiment; you should not need to modify it.

In the second file you can decide:

  1. whether to evaluate the navigation models using RGB or mid-level representations;
  2. the set of mid-level representations to use;
  3. the fusion architecture to use;
  4. whether to train or evaluate the models using real images, or using the CycleGAN sim2real-adapted observations.
The relevant excerpt from ddppo_pointnav_orangedev.yaml:

    ...
    EVAL_W_REAL_IMAGES: True
    EVAL_CKPT_PATH_DIR: "data/orangedev_checkpoints/"

    SIM_2_REAL: False # use CycleGAN for sim2real image adaptation?

    USE_MIDLEVEL_REPRESENTATION: True
    MIDLEVEL_PARAMS:
      ENCODER: "simple" # "simple", "SE_attention", "mid_fusion", ...
      FEATURE_TYPE: ["normal"] # ["normal", "keypoints3d", "curvature", "depth_zbuffer"]
    ...
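
For instance, switching to the SE_attention fusion of several mid-level representations would only require changing the two MIDLEVEL_PARAMS entries (a hypothetical example; the values are taken from the comments above, and the full set of valid options is defined in the repository):

    MIDLEVEL_PARAMS:
      ENCODER: "SE_attention"
      FEATURE_TYPE: ["normal", "keypoints3d", "curvature", "depth_zbuffer"]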

CycleGAN Integration (baseline)

To use CycleGAN with Habitat for sim2real domain adaptation during training or evaluation, follow the steps described in the repository of our previous release.

Train and Evaluation

To train the navigation model using the DD-PPO RL algorithm, run:

    sh habitat_baselines/rl/ddppo/single_node_orangedev.sh

To evaluate the navigation model using the DD-PPO RL algorithm, run:

    sh habitat_baselines/rl/ddppo/single_node_orangedev_eval.sh

For more information about the DD-PPO RL algorithm, please check out the habitat-lab dd-ppo repo page.

License

The code in this repository, the 3D models and the images of the proposed environment are MIT licensed. See the LICENSE file for details.

The trained models and the task datasets are considered data derived from the corresponding scene datasets.

Acknowledgements

This research is supported by OrangeDev s.r.l., by Next Vision s.r.l., by the project MEGABIT - PIAno di inCEntivi per la RIcerca di Ateneo 2020/2022 (PIACERI) – linea di intervento 2, DMI - University of Catania, and by the grant MIUR AIM - Attrazione e Mobilità Internazionale Linea 1 - AIM1893589 - CUP E64118002540007.
