ISTR: End-to-End Instance Segmentation with Transformers (https://arxiv.org/abs/2105.00637)

Last update: Dec 19, 2022

Related tags

Overview

This is the project page for the paper:

ISTR: End-to-End Instance Segmentation via Transformers,
Jie Hu, Liujuan Cao, Yao Lu, ShengChuan Zhang, Yan Wang, Ke Li, Feiyue Huang, Ling Shao, Rongrong Ji,
arXiv 2105.00637

⭐ Highlights:

GPU Friendly: Four 1080Ti/2080Ti GPUs can handle the training for R50, R101 backbones with ISTR.
High Performance: On COCO test-dev, ISTR-R50-3x gets 46.8/38.6 box/mask AP, and ISTR-R101-3x gets 48.1/39.9 box/mask AP.

Updates

(2021.05.03) The project page for ISTR is avaliable.

Models

Method	inf. time	box AP	mask AP	download
ISTR-R50-3x	17.8 FPS	46.8	38.6	model \| log
ISTR-R101-3x	13.9 FPS	48.1	39.9	model \| log

The inference time is evaluated with a single 2080Ti GPU.
We use the models pre-trained on ImageNet using torchvision. The ImageNet pre-trained ResNet-101 backbone is obtained from SparseR-CNN.

Installation

The codes are built on top of Detectron2, SparseR-CNN, and AdelaiDet.

Requirements

Python=3.8
PyTorch=1.6.0, torchvision=0.7.0, cudatoolkit=10.1
OpenCV for visualization

Steps

Install the repository (we recommend to use Anaconda for installation.)

conda create -n ISTR python=3.8 -y
conda activate ISTR
conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.1 -c pytorch
pip install opencv-python
pip install scipy
pip install shapely
git clone https://github.com/hujiecpp/ISTR.git
cd ISTR
python setup.py build develop

Link coco dataset path

ln -s /coco_dataset_path/coco ./datasets

Train ISTR (e.g., with ResNet50 backbone)

python projects/ISTR/train_net.py --num-gpus 4 --config-file projects/ISTR/configs/ISTR-R50-3x.yaml

Evaluate ISTR (e.g., with ResNet50 backbone)

python projects/ISTR/train_net.py --num-gpus 4 --config-file projects/ISTR/configs/ISTR-R50-3x.yaml --eval-only MODEL.WEIGHTS ./output/model_final.pth

Visualize the detection and segmentation results (e.g., with ResNet50 backbone)

python demo/demo.py --config-file projects/ISTR/configs/ISTR-R50-3x.yaml --input input1.jpg --output ./output --confidence-threshold 0.4 --opts MODEL.WEIGHTS ./output/model_final.pth

Citation

If our paper helps your research, please cite it in your publications:

@article{hu2021ISTR,
  title={ISTR: End-to-End Instance Segmentation via Transformers},
  author={Hu, Jie and Cao, Liujuan and Lu, Yao and Zhang, ShengChuan and Li, Ke and Huang, Feiyue and Shao, Ling and Ji, Rongrong},
  journal={arXiv preprint arXiv:2105.00637},
  year={2021}
}

ISTR: End-to-End Instance Segmentation with Transformers (https://arxiv.org/abs/2105.00637)

Related tags

Overview

Updates

Models

Installation

Requirements

Steps

Citation

Owner

Jie Hu

Style-based Neural Drum Synthesis with GAN inversion

This is an open source library implementing hyperbox-based machine learning algorithms

A deep learning network built with TensorFlow and Keras to classify gender and estimate age.

Türkiye Canlı Mobese Görüntülerinde Profesyonel Nesne Takip Sistemi

Python and Julia in harmony.

GPU-accelerated Image Processing library using OpenCL

Aiming at the common training datsets split, spectrum preprocessing, wavelength select and calibration models algorithm involved in the spectral analysis process

Grammar Induction using a Template Tree Approach

Source code, datasets and trained models for the paper Learning Advanced Mathematical Computations from Examples (ICLR 2021), by François Charton, Amaury Hayat (ENPC-Rutgers) and Guillaume Lample

Differentiable architecture search for convolutional and recurrent networks

Framework for training options with different attention mechanism and using them to solve downstream tasks.

The codes for the work "Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation"

Automatic differentiation with weighted finite-state transducers.

My freqtrade strategies

Revisting Open World Object Detection

Rethinking the Importance of Implementation Tricks in Multi-Agent Reinforcement Learning

Object Detection and Multi-Object Tracking

Python PID Tuner - Based on a FOPDT model obtained using a Open Loop Process Reaction Curve

This repository contains numerical implementation for the paper Intertemporal Pricing under Reference Effects: Integrating Reference Effects and Consumer Heterogeneity.

PROJECT - Az Residential Real Estate Analysis