code for CVPR paper Zero-shot Instance Segmentation

Overview

Code for CVPR2021 paper

Zero-shot Instance Segmentation

Code requirements

  • python: python3.7
  • nvidia GPU
  • pytorch1.1.0
  • GCC >=5.4
  • NCCL 2
  • the other python libs in requirement.txt

Install

conda create -n zsi python=3.7 -y
conda activate zsi

conda install pytorch=1.1.0 torchvision=0.3.0 cudatoolkit=10.0 -c pytorch

pip install cython && pip --no-cache-dir install -r requirements.txt
   
python setup.py develop

Dataset prepare

  • Download the train and test annotations files for zsi from annotations, put all json label file to

    data/coco/annotations/
    
  • Download MSCOCO-2014 dataset and unzip the images it to path:

    data/coco/train2014/
    data/coco/val2014/
    
  • Training:

    • 48/17 split:

         chmod +x tools/dist_train.sh
         ./tools/dist_train.sh configs/zsi/train/zero-shot-mask-rcnn-BARPN-bbox_mask_sync_bg_decoder.py 4
      
    • 65/15 split:

      chmod +x tools/dist_train.sh
      ./tools/dist_train.sh configs/zsi/train/zero-shot-mask-rcnn-BARPN-bbox_mask_sync_bg_65_15_decoder_notanh.py 4
      
  • Inference & Evaluate:

    • ZSI task:

      • 48/17 split ZSI task:
        • download 48/17 ZSI model, put it in checkpoints/ZSI_48_17.pth

        • inference:

          chmod +x tools/dist_test.sh
          ./tools/dist_test.sh configs/zsi/48_17/test/zsi/zero-shot-mask-rcnn-BARPN-bbox_mask_sync_bg_decoder.py checkpoints/ZSI_48_17.pth 4 --json_out results/zsi_48_17.json
          
        • our results zsi_48_17.bbox.json and zsi_48_17.segm.json can also downloaded from zsi_48_17_reults.

        • evaluate:

          • for zsd performance
            python tools/zsi_coco_eval.py results/zsi_48_17.bbox.json --ann data/coco/annotations/instances_val2014_unseen_48_17.json
            
          • for zsi performance
            python tools/zsi_coco_eval.py results/zsi_48_17.segm.json --ann data/coco/annotations/instances_val2014_unseen_48_17.json --types segm
            
      • 65/15 split ZSI task:
        • download 65/15 ZSI model, put it in checkpoints/ZSI_65_15.pth

        • inference:

          chmod +x tools/dist_test.sh
          ./toools/dist_test.sh configs/zsi/65_15/test/zsi/zero-shot-mask-rcnn-BARPN-bbox_mask_sync_bg_65_15_decoder_notanh.py checkpoints/ZSI_65_15.pth 4 --json_out results/zsi_65_15.json
          
        • our results zsi_65_15.bbox.json and zsi_65_15.segm.json can also downloaded from zsi_65_15_reults.

        • evaluate:

          • for zsd performance
            python tools/zsi_coco_eval.py results/zsi_65_15.bbox.json --ann data/coco/annotations/instances_val2014_unseen_65_15.json
            
          • for zsi performance
            python tools/zsi_coco_eval.py results/zsi_65_15.segm.json --ann data/coco/annotations/instances_val2014_unseen_65_15.json --types segm
            
    • GZSI task:

      • 48/17 split GZSI task:
        • use the same model file ZSI_48_17.pth in ZSI task
        • inference:
          chmod +x tools/dist_test.sh
          ./tools/dist_test.sh configs/zsi/48_17/test/gzsi/zero-shot-mask-rcnn-BARPN-bbox_mask_sync_bg_decoder_gzsi.py checkpoints/ZSI_48_17.pth 4 --json_out results/gzsi_48_17.json
          
        • our results gzsi_48_17.bbox.json and gzsi_48_17.segm.json can also downloaded from gzsi_48_17_results.
        • evaluate:
          • for gzsd
            python tools/gzsi_coco_eval.py results/gzsi_48_17.bbox.json --ann data/coco/annotations/instances_val2014_gzsi_48_17.json --gzsi --num-seen-classes 48
            
          • for gzsi
            python tools/gzsi_coco_eval.py results/gzsi_48_17.segm.json --ann data/coco/annotations/instances_val2014_gzsi_48_17.json --gzsi --num-seen-classes 48 --types segm
            
      • 65/15 split GZSI task:
        • use the same model file ZSI_48_17.pth in ZSI task
        • inference:
          chmod +x tools/dist_test.sh
          ./tools/dist_test.sh configs/zsi/65_15/test/gzsi/zero-shot-mask-rcnn-BARPN-bbox_mask_sync_bg_65_15_decoder_notanh_gzsi.py checkpoints/ZSI_65_15.pth 4 --json_out results/gzsi_65_15.json
          
        • our results gzsi_65_15.bbox.json and gzsi_65_15.segm.json can also downloaded from gzsi_65_15_results.
        • evaluate:
          • for gzsd
            python tools/gzsi_coco_eval.py results/gzsi_65_15.bbox.json --ann data/coco/annotations/instances_val2014_gzsi_65_15.json --gzsd --num-seen-classes 65
            
          • for gzsi
            python tools/gzsi_coco_eval.py results/gzsi_65_15.segm.json --ann data/coco/annotations/instances_val2014_gzsi_65_15.json --gzsd --num-seen-classes 65 --types segm
            

License

ZSI is released under MIT License.

Citing

If you use ZSI in your research or wish to refer to the baseline results published here, please use the following BibTeX entries:

@InProceedings{zhengye2021zsi,
  author  =  {Ye, Zheng and Jiahong, Wu and Yongqiag, Qin and Faen, Zhang and Li, Cui},
  title   =  {Zero-shot Instance Segmentation},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2021}
}
Owner
zhengye
CS Phd
zhengye
Official repository for "On Improving Adversarial Transferability of Vision Transformers" (2021)

Improving-Adversarial-Transferability-of-Vision-Transformers Muzammal Naseer, Kanchana Ranasinghe, Salman Khan, Fahad Khan, Fatih Porikli arxiv link A

Muzammal Naseer 47 Dec 02, 2022
Code for our ACL 2021 paper "One2Set: Generating Diverse Keyphrases as a Set"

One2Set This repository contains the code for our ACL 2021 paper “One2Set: Generating Diverse Keyphrases as a Set”. Our implementation is built on the

Jiacheng Ye 63 Jan 05, 2023
A robust camera and Lidar fusion based velocity estimator to undistort the pointcloud.

Lidar with Velocity A robust camera and Lidar fusion based velocity estimator to undistort the pointcloud. related paper: Lidar with Velocity : Motion

ISEE Research Group 164 Dec 30, 2022
This repository contains an overview of important follow-up works based on the original Vision Transformer (ViT) by Google.

This repository contains an overview of important follow-up works based on the original Vision Transformer (ViT) by Google.

75 Dec 02, 2022
The code written during my Bachelor Thesis "Classification of Human Whole-Body Motion using Hidden Markov Models".

This code was written during the course of my Bachelor thesis Classification of Human Whole-Body Motion using Hidden Markov Models. Some things might

Matthias Plappert 14 Dec 06, 2022
Towards Flexible Blind JPEG Artifacts Removal (FBCNN, ICCV 2021)

Towards Flexible Blind JPEG Artifacts Removal (FBCNN, ICCV 2021) Jiaxi Jiang, Kai Zhang, Radu Timofte Computer Vision Lab, ETH Zurich, Switzerland 🔥

Jiaxi Jiang 282 Jan 02, 2023
ROCKET: Exceptionally fast and accurate time series classification using random convolutional kernels

ROCKET + MINIROCKET ROCKET: Exceptionally fast and accurate time series classification using random convolutional kernels. Data Mining and Knowledge D

298 Dec 26, 2022
Python script to download the celebA-HQ dataset from google drive

download-celebA-HQ Python script to download and create the celebA-HQ dataset. WARNING from the author. I believe this script is broken since a few mo

133 Dec 21, 2022
Implement object segmentation on images using HOG algorithm proposed in CVPR 2005

HOG Algorithm Implementation Description HOG (Histograms of Oriented Gradients) Algorithm is an algorithm aiming to realize object segmentation (edge

Leo Hsieh 2 Mar 12, 2022
An off-line judger supporting distributed problem repositories

Thaw 中文 | English Thaw is an off-line judger supporting distributed problem repositories. Everyone can use Thaw release problems with license on GitHu

countercurrent_time 2 Jan 09, 2022
The official PyTorch code implementation of "Human Trajectory Prediction via Counterfactual Analysis" in ICCV 2021.

Human Trajectory Prediction via Counterfactual Analysis (CausalHTP) The official PyTorch code implementation of "Human Trajectory Prediction via Count

46 Dec 03, 2022
Simple image captioning model - CLIP prefix captioning.

Simple image captioning model - CLIP prefix captioning.

688 Jan 04, 2023
Website which uses Deep Learning to generate horror stories.

Creepypasta - Text Generator Website which uses Deep Learning to generate horror stories. View Demo · View Website Repo · Report Bug · Request Feature

Dhairya Sharma 5 Oct 14, 2022
A Conditional Point Diffusion-Refinement Paradigm for 3D Point Cloud Completion

A Conditional Point Diffusion-Refinement Paradigm for 3D Point Cloud Completion This repo intends to release code for our work: Zhaoyang Lyu*, Zhifeng

Zhaoyang Lyu 68 Jan 03, 2023
Evaluation Pipeline for our ECCV2020: Journey Towards Tiny Perceptual Super-Resolution.

Journey Towards Tiny Perceptual Super-Resolution Test code for our ECCV2020 paper: https://arxiv.org/abs/2007.04356 Our x4 upscaling pre-trained model

Royson 6 Mar 30, 2022
Official implementation of Self-supervised Graph Attention Networks (SuperGAT), ICLR 2021.

SuperGAT Official implementation of Self-supervised Graph Attention Networks (SuperGAT). This model is presented at How to Find Your Friendly Neighbor

Dongkwan Kim 127 Dec 28, 2022
DLWP: Deep Learning Weather Prediction

DLWP: Deep Learning Weather Prediction DLWP is a Python project containing data-

Kushal Shingote 3 Aug 14, 2022
Code for the paper: Sketch Your Own GAN

Sketch Your Own GAN Project | Paper | Youtube Our method takes in one or a few hand-drawn sketches and customizes an off-the-shelf GAN to match the in

677 Dec 28, 2022
Doing the asl sign language classification on static images using graph neural networks.

SignLangGNN When GNNs 💜 MediaPipe. This is a starter project where I tried to implement some traditional image classification problem i.e. the ASL si

10 Nov 09, 2022
PyTorch implementation of paper: HPNet: Deep Primitive Segmentation Using Hybrid Representations.

HPNet This repository contains the PyTorch implementation of paper: HPNet: Deep Primitive Segmentation Using Hybrid Representations. Installation The

Siming Yan 42 Dec 07, 2022