Code for Multimodal Neural SLAM for Interactive Instruction Following

Overview

Code for Multimodal Neural SLAM for Interactive Instruction Following

Code structure

The code is adapted from E.T. and most training as well as data processing files are in currently in the ET/notebooks folder and the et_train folder.

Dependency

Inherited from the E.T. repo, the package is depending on:

  • numpy
  • pandas
  • opencv-python
  • tqdm
  • vocab
  • revtok
  • numpy
  • Pillow
  • sacred
  • etaprogress
  • scikit-video
  • lmdb
  • gtimer
  • filelock
  • networkx
  • termcolor
  • torch==1.7.1
  • torchvision==0.8.2
  • tensorboardX==1.8
  • ai2thor==2.1.0
  • E.T. (https://github.com/alexpashevich/E.T.)

MaskRCNN Fine-tuning

To fine-tune the MaskRCNN module used in solving the Alfred challenge, we provide the code adapted from the official PyTorch tutorial.

Setup

We assume the environment and the code structure as in the E.T. model is set up, with this repo served as an extension. Although the fine-tuning code should be a standalone unit.

Training Data Geneation

Given a traj_data.json file (e.g., the 45K one used in E.T. joint-training here), run python -m alfred.gen.render_trajs as in E.T. to render the training inputs (raw images) and the ground truth labels (instance segmentation masks) for all the frames recorded in the traj_data.json files. Make sure the flag for generating instance level segmentation masks is set to True.

Pre-processing Instance Segmentation Masks

The rendered instance segmentation masks need to be preprocessed so that the data format is aligned with the one used in the official PyTorch tutorial. In specific, each generated mask is of a different RGB color per instance, which is mapped to the unique instance index in the frame as well as a label index for its semantic class. The mapping is constructed by looking up the traj['scene']['color_to_object_type'] in each of the json dictionaries. The code also supports the functionality to only collect training data from certain subgoal data (such as for PickupObject in Alfred). Notice that there are some bugs in the rendering process of the masks which creates some artifacts (small regions in the ground truth labels that correspond to no actual objects). This can be fixed by only selecting instance masks that are larger than certain area (e.g., > 10 as in alfred/data/maskrcnn.py).

Training

Run python -m alfred.maskrcnn.train which first loads the pre-trained model provided by E.T. and then fine-tunes it on the pre-processed data mentioned above.

Evaluation

We follow the MSCOCO evaluation protocal which is widely used for object detection and instance segmentation, which output average precision and recall at multiple scales. The evaluation function call evaluate(model, data_loader_test, device=device) in alfred/maskrcnn/train.py serves as an example.

Codes of paper "Unseen Object Amodal Instance Segmentation via Hierarchical Occlusion Modeling"

Unseen Object Amodal Instance Segmentation (UOAIS) Seunghyeok Back, Joosoon Lee, Taewon Kim, Sangjun Noh, Raeyoung Kang, Seongho Bak, Kyoobin Lee This

GIST-AILAB 92 Dec 13, 2022
Neighbor2Seq: Deep Learning on Massive Graphs by Transforming Neighbors to Sequences

Neighbor2Seq: Deep Learning on Massive Graphs by Transforming Neighbors to Sequences This repository is an official PyTorch implementation of Neighbor

DIVE Lab, Texas A&M University 8 Jun 12, 2022
Implementation of the paper titled "Using Sampling to Estimate and Improve Performance of Automated Scoring Systems with Guarantees"

Using Sampling to Estimate and Improve Performance of Automated Scoring Systems with Guarantees Implementation of the paper titled "Using Sampling to

MIDAS, IIIT Delhi 2 Aug 29, 2022
This is the official implementation of Elaborative Rehearsal for Zero-shot Action Recognition (ICCV2021)

Elaborative Rehearsal for Zero-shot Action Recognition This is an official implementation of: Shizhe Chen and Dong Huang, Elaborative Rehearsal for Ze

DeLightCMU 26 Sep 24, 2022
《Dual-Resolution Correspondence Network》(NeurIPS 2020)

Dual-Resolution Correspondence Network Dual-Resolution Correspondence Network, NeurIPS 2020 Dependency All dependencies are included in asset/dualrcne

Active Vision Laboratory 45 Nov 21, 2022
Code for the paper "Jukebox: A Generative Model for Music"

Status: Archive (code is provided as-is, no updates expected) Jukebox Code for "Jukebox: A Generative Model for Music" Paper Blog Explorer Colab Insta

OpenAI 6k Jan 02, 2023
GNNAdvisor: An Efficient Runtime System for GNN Acceleration on GPUs

GNNAdvisor: An Efficient Runtime System for GNN Acceleration on GPUs [Paper, Slides, Video Talk] at USENIX OSDI'21 @inproceedings{GNNAdvisor, title=

YUKE WANG 47 Jan 03, 2023
OpenCV, MediaPipe Pose Estimation, Affine Transform for Icon Overlay

Yoga Pose Identification and Icon Matching Project Goal Detect yoga poses performed by a user and overlay a corresponding icon image. Running the main

Anna Garverick 1 Dec 03, 2021
Semi-supervised Semantic Segmentation with Directional Context-aware Consistency (CVPR 2021)

Semi-supervised Semantic Segmentation with Directional Context-aware Consistency (CAC) Xin Lai*, Zhuotao Tian*, Li Jiang, Shu Liu, Hengshuang Zhao, Li

DV Lab 137 Dec 14, 2022
Python Implementation of Chess Playing AI with variable difficulty

Chess AI with variable difficulty level implemented using the MiniMax AB-Pruning Algorithm

Ali Imran 7 Feb 20, 2022
The implement of papar "Enhanced Graph Learning for Collaborative Filtering via Mutual Information Maximization"

SIGIR2021-EGLN The implement of paper "Enhanced Graph Learning for Collaborative Filtering via Mutual Information Maximization" Neural graph based Col

15 Dec 27, 2022
Official Pytorch and JAX implementation of "Efficient-VDVAE: Less is more"

The Official Pytorch and JAX implementation of "Efficient-VDVAE: Less is more" Arxiv preprint Louay Hazami   ·   Rayhane Mama   ·   Ragavan Thurairatn

Rayhane Mama 144 Dec 23, 2022
Graph Regularized Residual Subspace Clustering Network for hyperspectral image clustering

Graph Regularized Residual Subspace Clustering Network for hyperspectral image clustering

Yaoming Cai 5 Jul 18, 2022
This is the official pytorch implementation for the paper: Instance Similarity Learning for Unsupervised Feature Representation.

ISL This is the official pytorch implementation for the paper: Instance Similarity Learning for Unsupervised Feature Representation, which is accepted

19 May 04, 2022
Classify bird species based on their songs using SIamese Networks and 1D dilated convolutions.

The goal is to classify different birds species based on their songs/calls. Spectrograms have been extracted from the audio samples and used as features for classification.

Aditya Dutt 9 Dec 27, 2022
Repository for paper "Non-intrusive speech intelligibility prediction from discrete latent representations"

Non-Intrusive Speech Intelligibility Prediction from Discrete Latent Representations Official repository for paper "Non-Intrusive Speech Intelligibili

Alex McKinney 5 Oct 25, 2022
Implementation of ICCV21 paper: PnP-DETR: Towards Efficient Visual Analysis with Transformers

Implementation of ICCV 2021 paper: PnP-DETR: Towards Efficient Visual Analysis with Transformers arxiv This repository is based on detr Recently, DETR

twang 113 Dec 27, 2022
[CVPR2022] Representation Compensation Networks for Continual Semantic Segmentation

RCIL [CVPR2022] Representation Compensation Networks for Continual Semantic Segmentation Chang-Bin Zhang1, Jia-Wen Xiao1, Xialei Liu1, Ying-Cong Chen2

Chang-Bin Zhang 71 Dec 28, 2022
This repository contains the code for the CVPR 2020 paper "Differentiable Volumetric Rendering: Learning Implicit 3D Representations without 3D Supervision"

Differentiable Volumetric Rendering Paper | Supplementary | Spotlight Video | Blog Entry | Presentation | Interactive Slides | Project Page This repos

697 Jan 06, 2023
Automatic Calibration for Non-repetitive Scanning Solid-State LiDAR and Camera Systems

ACSC Automatic extrinsic calibration for non-repetitive scanning solid-state LiDAR and camera systems. System Architecture 1. Dependency Tested with U

KINO 192 Dec 13, 2022