PyTorch Implementation of Region Similarity Representation Learning (ReSim)

Last update: Jan 03, 2023

Related tags

Overview

ReSim

This repository provides the PyTorch implementation of Region Similarity Representation Learning (ReSim) described in this paper:

@Article{xiao2021region,
  author  = {Tete Xiao and Colorado J Reed and Xiaolong Wang and Kurt Keutzer and Trevor Darrell},
  title   = {Region Similarity Representation Learning},
  journal = {arXiv preprint arXiv:2103.12902},
  year    = {2021},
}

tldr; ReSim maintains spatial relationships in the convolutional feature maps when performing instance contrastive pre-training, which is useful for region-related tasks such as object detection, segmentation, and dense pose estimation.

Installation

Assuming a conda environment:

conda create --name resim python=3.7
conda activate resim

# NOTE: if you are not using CUDA 10.2, you need to change the 10.2 in this command appropriately. 
# Code tested with torch 1.6 and 1.7
# (check CUDA version with e.g. `cat /usr/local/cuda/version.txt`)
conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.2 -c pytorch

Pre-training

This codebase is based on the original MoCo codebase -- see this README for more details.

To pre-train for 200 epochs using the ReSim-FPN implementation as described in the paper:

python main_moco.py -a resnet50 --lr 0.03 --batch-size 256 \
       --dist-url tcp://localhost:10005 --multiprocessing-distributed --world-size 1 --rank 0 \
       --mlp --moco-t 0.2 --aug-plus --cos --epochs 200 \
       /location/of/imagenet/data/folder

ResNet-50 Pre-trained Models

Checkpoint	Pre-train Epochs	COCO AP @2x	MoCo Checkpoint	Detectron Backbone
ReSim-FPN	400	41.9	Download	Download
ReSim-FPN	200	41.4	Download	Download
ReSim-C4	200	41.1	Download	Download

Detection

See these instructions for more details, but in brief:

# first install detectron2
# then place COCO-2017 dataset detection/datasets/coco

cd detection
python convert-pretrain-to-detectron2.py ../resim_fpn_checkpoint_latest.pth.tar detectron_resim_fpn_checkpoint_latest.pth.tar
python train_net.py --dist-url 'tcp://127.0.0.1:17654' --config-file configs/coco_R_50_FPN_2x_moco.yaml --num-gpus 8 MODEL.WEIGHTS detectron_resim_fpn_checkpoint_latest.pth.tar TEST.EVAL_PERIOD 180000 OUTPUT_DIR results/coco2x-resim-fpn SOLVER.CHECKPOINT_PERIOD 180000

License

This project is under the CC-BY-NC 4.0 license. See LICENSE.

PyTorch Implementation of Region Similarity Representation Learning (ReSim)

Related tags

Overview

ReSim

Installation

Pre-training

ResNet-50 Pre-trained Models

Detection

License

Owner

Tete Xiao

Job Assignment System by Real-time Emotion Detection

Code of Puregaze: Purifying gaze feature for generalizable gaze estimation, AAAI 2022.

YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with ONNX, TensorRT, ncnn, and OpenVINO supported.

This is a repository for a No-Code object detection inference API using the OpenVINO. It's supported on both Windows and Linux Operating systems.

Construct a neural network frame by Numpy

[CVPR 2021] Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision

Information Gain Filtration (IGF) is a method for filtering domain-specific data during language model finetuning. IGF shows significant improvements over baseline fine-tuning without data filtration.

Class activation maps for your PyTorch models (CAM, Grad-CAM, Grad-CAM++, Smooth Grad-CAM++, Score-CAM, SS-CAM, IS-CAM, XGrad-CAM, Layer-CAM)

Source code for the paper "PLOME: Pre-training with Misspelled Knowledge for Chinese Spelling Correction" in ACL2021

Quantized tflite models for ailia TFLite Runtime

Code Repository for Liquid Time-Constant Networks (LTCs)

Finding Donors for CharityML

🥈78th place in Riiid Answer Correctness Prediction competition

Multi-view 3D reconstruction using neural rendering. Unofficial implementation of UNISURF, VolSDF, NeuS and more.

E2VID_ROS - E2VID_ROS: E2VID to a real-time system

M2MRF: Many-to-Many Reassembly of Features for Tiny Lesion Segmentation in Fundus Images

Official PyTorch implementation of CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds

🧮 Matrix Factorization for Collaborative Filtering is just Solving an Adjoint Latent Dirichlet Allocation Model after All

Facial Image Inpainting with Semantic Control

MoCoGAN: Decomposing Motion and Content for Video Generation