ISTR: End-to-End Instance Segmentation with Transformers (https://arxiv.org/abs/2105.00637)

Last update: Dec 19, 2022

Overview

This is the project page for the paper:

ISTR: End-to-End Instance Segmentation via Transformers,
Jie Hu, Liujuan Cao, Yao Lu, ShengChuan Zhang, Yan Wang, Ke Li, Feiyue Huang, Ling Shao, Rongrong Ji,
arXiv 2105.00637

⭐ Highlights:

GPU Friendly: Four 1080Ti/2080Ti GPUs are enough to train ISTR with an R50 or R101 backbone.
High Performance: On COCO test-dev, ISTR-R50-3x gets 46.8/38.6 box/mask AP, and ISTR-R101-3x gets 48.1/39.9 box/mask AP.

Updates

(2021.05.03) The project page for ISTR is available.

Models

Method	inference speed	box AP	mask AP	download
ISTR-R50-3x	17.8 FPS	46.8	38.6	model | log
ISTR-R101-3x	13.9 FPS	48.1	39.9	model | log

Inference speed is measured on a single 2080Ti GPU.
We use backbones pre-trained on ImageNet with torchvision. The ImageNet pre-trained ResNet-101 backbone is taken from SparseR-CNN.
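
For readers who think in latency rather than throughput, the FPS figures in the Models table can be converted to milliseconds per image. The snippet below is only a small arithmetic sketch. The helper name `latency_ms` is ours and is not part of the ISTR code base.

```python
# FPS values copied from the Models table (single 2080Ti GPU).
fps_by_model = {"ISTR-R50-3x": 17.8, "ISTR-R101-3x": 13.9}


def latency_ms(fps: float) -> float:
    """Convert frames per second into milliseconds per image."""
    return 1000.0 / fps


for name, fps in fps_by_model.items():
    # Prints roughly 56.2 ms for R50 and 71.9 ms for R101.
    print(f"{name}: {latency_ms(fps):.1f} ms/image")
```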

Installation

The code is built on top of Detectron2, SparseR-CNN, and AdelaiDet.

Requirements

Python=3.8
PyTorch=1.6.0, torchvision=0.7.0, cudatoolkit=10.1
OpenCV for visualization

Steps

Install the repository (we recommend using Anaconda for installation).

conda create -n ISTR python=3.8 -y
conda activate ISTR
conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.1 -c pytorch
pip install opencv-python
pip install scipy
pip install shapely
git clone https://github.com/hujiecpp/ISTR.git
cd ISTR
python setup.py build develop
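
After the install commands finish, it can be worth confirming that the active environment really holds the pinned versions from the Requirements list. The following is a minimal sketch, not part of the repository. The names `REQUIRED`, `installed_version`, and `check_pins` are hypothetical, and it assumes the packages expose their metadata under the distribution names `torch` and `torchvision`.

```python
from importlib import metadata

# Pins copied from the Requirements list; the distribution names are an assumption.
REQUIRED = {"torch": "1.6.0", "torchvision": "0.7.0"}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_pins(required, lookup=installed_version):
    """Return a list of human-readable mismatch messages (empty means OK)."""
    problems = []
    for name, wanted in required.items():
        found = lookup(name)
        if found is None:
            problems.append(f"{name}: not installed (want {wanted})")
        # Strip local build tags such as "+cu101" before comparing.
        elif found.split("+")[0] != wanted:
            problems.append(f"{name}: found {found}, want {wanted}")
    return problems


if __name__ == "__main__":
    for line in check_pins(REQUIRED) or ["all pinned packages match"]:
        print(line)
```

Run it inside the activated ISTR environment. An empty problem list only means the version strings match; it does not verify that CUDA itself works.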

Link the COCO dataset path

ln -s /coco_dataset_path/coco ./datasets
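
A quick way to catch a wrong symlink before launching a long training job is to check the linked folder for the files Detectron2's built-in COCO 2017 datasets read. This sketch assumes the ISTR configs use those built-in datasets (`coco_2017_train` / `coco_2017_val`), which this page does not state. The names `EXPECTED` and `missing_coco_entries` are hypothetical helpers.

```python
from pathlib import Path

# Layout Detectron2's built-in COCO 2017 datasets look for under ./datasets/coco.
EXPECTED = [
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
    "train2017",
    "val2017",
]


def missing_coco_entries(coco_root):
    """Return the expected entries that do not exist under coco_root."""
    root = Path(coco_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    # Run from the repository root, where ./datasets/coco is the symlink.
    missing = missing_coco_entries("datasets/coco")
    print("COCO layout looks complete" if not missing else f"missing: {missing}")
```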

Train ISTR (e.g., with ResNet50 backbone)

python projects/ISTR/train_net.py --num-gpus 4 --config-file projects/ISTR/configs/ISTR-R50-3x.yaml

Evaluate ISTR (e.g., with ResNet50 backbone)

python projects/ISTR/train_net.py --num-gpus 4 --config-file projects/ISTR/configs/ISTR-R50-3x.yaml --eval-only MODEL.WEIGHTS ./output/model_final.pth

Visualize the detection and segmentation results (e.g., with ResNet50 backbone)

python demo/demo.py --config-file projects/ISTR/configs/ISTR-R50-3x.yaml --input input1.jpg --output ./output --confidence-threshold 0.4 --opts MODEL.WEIGHTS ./output/model_final.pth

Citation

If our paper helps your research, please cite it in your publications:

@article{hu2021ISTR,
  title={ISTR: End-to-End Instance Segmentation via Transformers},
  author={Hu, Jie and Cao, Liujuan and Lu, Yao and Zhang, ShengChuan and Wang, Yan and Li, Ke and Huang, Feiyue and Shao, Ling and Ji, Rongrong},
  journal={arXiv preprint arXiv:2105.00637},
  year={2021}
}
