Code for Learning to Segment The Tail (LST)

Related tags

Deep LearningLST_LVIS
Overview

Learning to Segment the Tail

[arXiv]


In this repository, we release code for Learning to Segment The Tail (LST). The code is directly modified from the project maskrcnn_benchmark, which is an excellent codebase! If you get any problem that causes you unable to run the project, you can check the issues under maskrcnn_benchmark first.

Installation

Please following INSTALL.md for maskrcnn_benchmark. For experiments on LVIS_v0.5 dataset, you need to use lvis-api.

LVIS Dataset

After downloading LVIS_v0.5 dataset (the images are the same as COCO 2017 version), we recommend to symlink the path to the lvis dataset to datasets/ as follows

# symlink the lvis dataset
cd ~/github/LST_LVIS
mkdir -p datasets/lvis
ln -s /path_to_lvis_dataset/annotations datasets/lvis/annotations
ln -s /path_to_coco_dataset/images datasets/lvis/images

A detailed visualization demo for LVIS is LVIS_visualization. You'll find it is the most useful thing you can get from this repo :P

Dataset Pre-processing and Indices Generation

dataset_preprocess.ipynb: LVIS dataset is split into the base set and sets for the incremental phases.

balanced_replay.ipynb: We generate indices to load the LVIS dataset offline using the balanced replay scheme discussed in our paper.

Training

Our pre-trained model is model. You can trim the model and load it for LVIS training as in trim_model. Modifications to the backbone follows MaskX R-CNN. You can also check our paper for detail.

training for base

The base training is the same as conventional training. For example, to train a model with 8 GPUs you can run:

python -m torch.distributed.launch --nproc_per_node=8 /path_to_maskrcnn_benchmark/tools/train_net.py --use-tensorboard --config-file "/path/to/config/train_file.yaml"  MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN 1000

The details about MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN is discussed in maskrcnn-benchmark.

Edit this line to initialze the dataloader with corresponding sorted category ids.

training for incremental steps

The training for each incremental phase is armed with our data balanced replay. It needs to be initialized properly here, providing the corresponding external img-id/cls-id pairs for data-loading.

get distillation

We use ground truth bounding boxes to get prediction logits using the model trained from last step. Change this to decide which classes to be distilled.

Here is an example for running:

python ./tools/train_net.py --use-tensorboard --config-file "/path/to/config/get_distillation_file.yaml" MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN 1000

The output distillation logits are saved in json format.

Evaluation

The evaluation for LVIS is a little bit different from COCO since it is not exhausted annotated, which is discussed in detail in Gupta et al.'s work.

We also report the AP for each phase and each class, which can provide better analysis.

You can run:

export NGPUS=8
python -m torch.distributed.launch --nproc_per_node=$NGPUS /path_to_maskrcnn_benchmark/tools/test_net.py --config-file "/path/to/config/train_file.yaml" 

We also provide periodically testing to check the result better, as discussed in this issue.

Thanks for all the previous work and the sharing of their codes. Sorry for my ugly code and I appreciate your advice.

TorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision

TorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision @misc{you2019torchcv, author = {Ansheng You and Xiangtai Li and Zhen Zhu a

Donny You 2.2k Jan 06, 2023
Unofficial implementation of Google "CutPaste: Self-Supervised Learning for Anomaly Detection and Localization" in PyTorch

CutPaste CutPaste: image from paper Unofficial implementation of Google's "CutPaste: Self-Supervised Learning for Anomaly Detection and Localization"

Lilit Yolyan 59 Nov 27, 2022
This is the official code release for the paper Shape and Material Capture at Home

This is the official code release for the paper Shape and Material Capture at Home. The code enables you to reconstruct a 3D mesh and Cook-Torrance BRDF from one or more images captured with a flashl

89 Dec 10, 2022
Official implementation of "Dynamic Anchor Learning for Arbitrary-Oriented Object Detection" (AAAI2021).

DAL This project hosts the official implementation for our AAAI 2021 paper: Dynamic Anchor Learning for Arbitrary-Oriented Object Detection [arxiv] [c

ming71 215 Nov 28, 2022
Code for "FPS-Net: A convolutional fusion network for large-scale LiDAR point cloud segmentation".

FPS-Net Code for "FPS-Net: A convolutional fusion network for large-scale LiDAR point cloud segmentation", accepted by ISPRS journal of Photogrammetry

15 Nov 30, 2022
Pytorch Implementation of Continual Learning With Filter Atom Swapping (ICLR'22 Spolight) Paper

Continual Learning With Filter Atom Swapping Pytorch Implementation of Continual Learning With Filter Atom Swapping (ICLR'22 Spolight) Paper If find t

11 Aug 29, 2022
Real-time pose estimation accelerated with NVIDIA TensorRT

trt_pose Want to detect hand poses? Check out the new trt_pose_hand project for real-time hand pose and gesture recognition! trt_pose is aimed at enab

NVIDIA AI IOT 803 Jan 06, 2023
Official code for "Stereo Waterdrop Removal with Row-wise Dilated Attention (IROS2021)"

Stereo-Waterdrop-Removal-with-Row-wise-Dilated-Attention This repository includes official codes for "Stereo Waterdrop Removal with Row-wise Dilated A

29 Oct 01, 2022
Semi-supervised Semantic Segmentation with Directional Context-aware Consistency (CVPR 2021)

Semi-supervised Semantic Segmentation with Directional Context-aware Consistency (CAC) Xin Lai*, Zhuotao Tian*, Li Jiang, Shu Liu, Hengshuang Zhao, Li

DV Lab 137 Dec 14, 2022
Multispectral Object Detection with Yolov5

Multispectral-Object-Detection Intro Official Code for Cross-Modality Fusion Transformer for Multispectral Object Detection. Multispectral Object Dete

Richard Fang 121 Jan 01, 2023
Cards Against Humanity AI

cah-ai This is a Cards Against Humanity AI implemented using a pre-trained Semantic Search model. How it works A player is described by a combination

Alex Nichol 2 Aug 22, 2022
NVIDIA Merlin is an open source library providing end-to-end GPU-accelerated recommender systems, from feature engineering and preprocessing to training deep learning models and running inference in production.

NVIDIA Merlin NVIDIA Merlin is an open source library designed to accelerate recommender systems on NVIDIA’s GPUs. It enables data scientists, machine

419 Jan 03, 2023
Real-world Anomaly Detection in Surveillance Videos- pytorch Re-implementation

Real world Anomaly Detection in Surveillance Videos : Pytorch RE-Implementation This repository is a re-implementation of "Real-world Anomaly Detectio

seominseok 62 Dec 08, 2022
Look Who’s Talking: Active Speaker Detection in the Wild

Look Who's Talking: Active Speaker Detection in the Wild Dependencies pip install -r requirements.txt In addition to the Python dependencies, ffmpeg

Clova AI Research 60 Dec 08, 2022
A PyTorch implementation of the continual learning experiments with deep neural networks

Brain-Inspired Replay A PyTorch implementation of the continual learning experiments with deep neural networks described in the following paper: Brain

182 Dec 27, 2022
A trashy useless Latin programming language written in python.

Codigum! The first programming langage in latin! (please keep your eyes closed when if you read the source code) It is pretty useless though. Document

Bic 2 Oct 25, 2021
A Deep Learning Based Knowledge Extraction Toolkit for Knowledge Base Population

DeepKE is a knowledge extraction toolkit supporting low-resource and document-level scenarios for entity, relation and attribute extraction. We provide comprehensive documents, Google Colab tutorials

ZJUNLP 1.6k Jan 05, 2023
Global Pooling, More than Meets the Eye: Position Information is Encoded Channel-Wise in CNNs, ICCV 2021

Global Pooling, More than Meets the Eye: Position Information is Encoded Channel-Wise in CNNs, ICCV 2021 Global Pooling, More than Meets the Eye: Posi

Md Amirul Islam 32 Apr 24, 2022
Teaching end to end workflow of deep learning

Deep-Education This repository is now available for public use for teaching end to end workflow of deep learning. This implies that learners/researche

Data Lab at College of William and Mary 2 Sep 26, 2022
[TIP 2020] Multi-Temporal Scene Classification and Scene Change Detection with Correlation based Fusion

Multi-Temporal Scene Classification and Scene Change Detection with Correlation based Fusion Code for Multi-Temporal Scene Classification and Scene Ch

Lixiang Ru 33 Dec 12, 2022