Rank 1st in the public leaderboard of ScanRefer (2021-03-18)

Overview

InstanceRefer

InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring

This repository is for the 1st method on ScanRefer benchmark [arxiv paper].

Zhihao Yuan, Xu Yan, Yinghong Liao, Ruimao Zhang, Zhen Li*, Shuguang Cui

If you find our work useful in your research, please consider citing:

@InProceedings{yuan2021instancerefer,
  title={InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring},
  author={Zhihao Yuan, Xu Yan, Yinghong Liao, Ruimao Zhang, Zhen Li, Shuguang Cui},
  journal={arXiv preprint},
  year={2021}
}

News

  • 2021-03-31 We release InstanceRefer v1 πŸš€ !
  • 2021-03-18 We achieve 1st place in ScanRefer leaderboard πŸ”₯ .

Getting Started

Setup

The code is tested on Ubuntu 16.04 LTS & 18.04 LTS with PyTorch 1.3.0 CUDA 10.1 installed.

conda install pytorch==1.3.0 cudatoolkit=10.1 -c pytorch

Install the necessary packages listed out in requirements.txt:

pip install -r requirements.txt

After all packages are properly installed, please run the following commands to compile the torchsaprse:

cd lib/torchsparse/
python setup.py install

Before moving on to the next step, please don't forget to set the project root path to the CONF.PATH.BASE in lib/config.py.

Data preparation

  1. Download the ScanRefer dataset and unzip it under data/.
  2. Downloadand the preprocessed GLoVE embeddings (~990MB) and put them under data/.
  3. Download the ScanNetV2 dataset and put (or link) scans/ under (or to) data/scannet/scans/ (Please follow the ScanNet Instructions for downloading the ScanNet dataset). After this step, there should be folders containing the ScanNet scene data under the data/scannet/scans/ with names like scene0000_00
  4. Used official and pre-trained PointGroup generate panoptic segmentation in PointGroupInst/. We provide pre-processed data in Baidu Netdisk [password: 0nxc].
  5. Pre-processed instance labels, and new data should be generated in data/scannet/pointgroup_data/
cd data/scannet/
python prepare_data.py --split train --pointgroupinst_path [YOUR_PATH]
python prepare_data.py --split val   --pointgroupinst_path [YOUR_PATH]
python prepare_data.py --split test  --pointgroupinst_path [YOUR_PATH]

Finally, the dataset folder should be organized as follows.

InstanceRefer
β”œβ”€β”€ data
β”‚   β”œβ”€β”€ scannet
β”‚   β”‚  β”œβ”€β”€ meta_data
β”‚   β”‚  β”œβ”€β”€ pointgroup_data
β”‚   β”‚  β”‚  β”œβ”€β”€ scene0000_00_aligned_bbox.npy
β”‚   β”‚  β”‚  β”œβ”€β”€ scene0000_00_aligned_vert.npy
β”‚   β”‚  β”œβ”€β”€β”œβ”€β”€  ... ...

Training

Train the InstanceRefer model. You can change hyper-parameters in config/InstanceRefer.yaml:

python scripts/train.py --log_dir instancerefer

TODO

  • Updating to the best version.
  • Release codes for prediction on benchmark.
  • Release pre-trained model.
  • Merge PointGroup in an end-to-end manner.

Acknowledgement

This project is not possible without multiple great opensourced codebases.

License

This repository is released under MIT License (see LICENSE file for details).

HTSeq is a Python library to facilitate processing and analysis of data from high-throughput sequencing (HTS) experiments.

HTSeq DEVS: https://github.com/htseq/htseq DOCS: https://htseq.readthedocs.io A Python library to facilitate programmatic analysis of data from high-t

HTSeq 57 Dec 20, 2022
StableSims is an open-source project aimed at simulating MakerDAO's Dai stablecoin system

StableSims is an open-source project aimed at simulating MakerDAO's Dai stablecoin system, initially used for researching optimal incentive parameters for Liquidations 2.0.

Blockchain at Berkeley 52 Nov 21, 2022
This repo provides function call to track multi-objects in videos

Custom Object Tracking Introduction This repo provides function call to track multi-objects in videos with a given trained object detection model and

Jeff Lo 51 Nov 22, 2022
Save-restricted-v-3 - Save restricted content Bot For telegram

Save restricted content Bot Contact: Telegram A stable telegram bot to get restr

DEVANSH 11 Dec 21, 2022
The source code of "SIDE: Center-based Stereo 3D Detector with Structure-aware Instance Depth Estimation", accepted to WACV 2022.

SIDE: Center-based Stereo 3D Detector with Structure-aware Instance Depth Estimation The source code of our work "SIDE: Center-based Stereo 3D Detecto

10 Dec 18, 2022
An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates neural fields, predictive coding, top-down-bottom-up, and attention (consensus between columns)

GLOM - Pytorch (wip) An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates neural fields, predictive coding,

Phil Wang 173 Dec 14, 2022
Experiments on Flood Segmentation on Sentinel-1 SAR Imagery with Cyclical Pseudo Labeling and Noisy Student Training

Flood Detection Challenge This repository contains code for our submission to the ETCI 2021 Competition on Flood Detection (Winning Solution #2). Acco

Siddha Ganju 108 Dec 28, 2022
RefineGNN - Iterative refinement graph neural network for antibody sequence-structure co-design (RefineGNN)

Iterative refinement graph neural network for antibody sequence-structure co-des

Wengong Jin 83 Dec 31, 2022
Recognize numbers from an (28 x 28) image using neural networks

Number recognition Recognize numbers from a 28 x 28 image using neural networks Usage This is an example of a simple usage of number-recognition NOTE:

Mauro BaladΓ©s 2 Dec 29, 2021
Dynamic Token Normalization Improves Vision Transformers

Dynamic Token Normalization Improves Vision Transformers This is the PyTorch implementation of the paper Dynamic Token Normalization Improves Vision T

Wenqi Shao 20 Oct 09, 2022
Codes of paper "Unseen Object Amodal Instance Segmentation via Hierarchical Occlusion Modeling"

Unseen Object Amodal Instance Segmentation (UOAIS) Seunghyeok Back, Joosoon Lee, Taewon Kim, Sangjun Noh, Raeyoung Kang, Seongho Bak, Kyoobin Lee This

GIST-AILAB 92 Dec 13, 2022
Pyramid Scene Parsing Network, CVPR2017.

Pyramid Scene Parsing Network by Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia, details are in project page. Introduction This

Hengshuang Zhao 1.5k Jan 05, 2023
Unofficial keras(tensorflow) implementation of MAE model from Masked Autoencoders Are Scalable Vision Learners

MAE-keras Unofficial keras(tensorflow) implementation of MAE model described in 'Masked Autoencoders Are Scalable Vision Learners'. This work has been

Yewon 11 Jun 12, 2022
Awesome Graph Classification - A collection of important graph embedding, classification and representation learning papers with implementations.

A collection of graph classification methods, covering embedding, deep learning, graph kernel and factorization papers

Benedek Rozemberczki 4.5k Jan 01, 2023
Code for our CVPR 2021 Paper "Rethinking Style Transfer: From Pixels to Parameterized Brushstrokes".

Rethinking Style Transfer: From Pixels to Parameterized Brushstrokes (CVPR 2021) Project page | Paper | Colab | Colab for Drawing App Rethinking Style

CompVis Heidelberg 153 Jan 04, 2023
DeepSpamReview: Detection of Fake Reviews on Online Review Platforms using Deep Learning Architectures. Summer Internship project at CoreView Systems.

Detection of Fake Reviews on Online Review Platforms using Deep Learning Architectures Dataset: https://s3.amazonaws.com/fast-ai-nlp/yelp_review_polar

Ashish Salunkhe 37 Dec 17, 2022
A hue shift helper for OBS

obs-hue-shift A hue shift helper for OBS This is a repo based on the really nice script Hegemege made. The original script can be found https://gist.g

Alexis Tyler 1 Jan 10, 2022
code for Grapadora research paper experimentation

Road feature embedding selection method Code for research paper experimentation Abstract Traffic forecasting models rely on data that needs to be sens

Eric LΓ³pez Manibardo 0 May 26, 2022
A map update dataset and benchmark

MUNO21 MUNO21 is a dataset and benchmark for machine learning methods that automatically update and maintain digital street map datasets. Previous dat

16 Nov 30, 2022
【CVPR 2021, Variational Inference Framework, PyTorch】 From Rain Generation to Rain Removal

From Rain Generation to Rain Removal (CVPR2021) Hong Wang, Zongsheng Yue, Qi Xie, Qian Zhao, Yefeng Zheng, and Deyu Meng [PDF&&Supplementary Material]

Hong Wang 48 Nov 23, 2022