RoIAlign & crop_and_resize for PyTorch

Related tags

Deep Learningpytorch
Overview

RoIAlign for PyTorch

This is a PyTorch version of RoIAlign. This implementation is based on crop_and_resize and supports both forward and backward on CPU and GPU.

NOTE: Thanks meikuam for updating this repo for PyTorch 1.0. You can find the original version for torch <= 0.4.1 in pytorch_0.4 branch.

Introduction

The crop_and_resize function is ported from tensorflow, and has the same interface with tensorflow version, except the input feature map should be in NCHW order in PyTorch. They also have the same output value (error < 1e-5) for both forward and backward as we expected, see the comparision in test.py.

Note: Document of crop_and_resize can be found here. And RoIAlign is a wrap of crop_and_resize that uses boxes with unnormalized (x1, y1, x2, y2) as input (while crop_and_resize use normalized (y1, x1, y2, x2) as input). See more details about the difference of RoIAlign and crop_and_resize in tensorpack.

Warning: Currently it only works using the default GPU (index 0)

Usage

  • Install and test

    python setup.py install
    ./test.sh
    
  • Use RoIAlign or crop_and_resize

    Since PyTorch 1.2.0 Legacy autograd function with non-static forward method is deprecated. We use new-style autograd function with static forward method. Example:

    import torch
    from roi_align import RoIAlign      # RoIAlign module
    from roi_align import CropAndResize # crop_and_resize module
    
    # input feature maps (suppose that we have batch_size==2)
    image = torch.arange(0., 49).view(1, 1, 7, 7).repeat(2, 1, 1, 1)
    image[0] += 10
    print('image: ', image)
    
    
    # for example, we have two bboxes with coords xyxy (first with batch_id=0, second with batch_id=1).
    boxes = torch.Tensor([[1, 0, 5, 4],
                         [0.5, 3.5, 4, 7]])
    
    box_index = torch.tensor([0, 1], dtype=torch.int) # index of bbox in batch
    
    # RoIAlign layer with crop sizes:
    crop_height = 4
    crop_width = 4
    roi_align = RoIAlign(crop_height, crop_width)
    
    # make crops:
    crops = roi_align(image, boxes, box_index)
    
    print('crops:', crops)

    Output:

    image:  tensor([[[[10., 11., 12., 13., 14., 15., 16.],
          [17., 18., 19., 20., 21., 22., 23.],
          [24., 25., 26., 27., 28., 29., 30.],
          [31., 32., 33., 34., 35., 36., 37.],
          [38., 39., 40., 41., 42., 43., 44.],
          [45., 46., 47., 48., 49., 50., 51.],
          [52., 53., 54., 55., 56., 57., 58.]]],
    
    
        [[[ 0.,  1.,  2.,  3.,  4.,  5.,  6.],
          [ 7.,  8.,  9., 10., 11., 12., 13.],
          [14., 15., 16., 17., 18., 19., 20.],
          [21., 22., 23., 24., 25., 26., 27.],
          [28., 29., 30., 31., 32., 33., 34.],
          [35., 36., 37., 38., 39., 40., 41.],
          [42., 43., 44., 45., 46., 47., 48.]]]])
          
    crops: tensor([[[[11.0000, 12.0000, 13.0000, 14.0000],
              [18.0000, 19.0000, 20.0000, 21.0000],
              [25.0000, 26.0000, 27.0000, 28.0000],
              [32.0000, 33.0000, 34.0000, 35.0000]]],
    
    
            [[[24.5000, 25.3750, 26.2500, 27.1250],
              [30.6250, 31.5000, 32.3750, 33.2500],
              [36.7500, 37.6250, 38.5000, 39.3750],
              [ 0.0000,  0.0000,  0.0000,  0.0000]]]])
Owner
Long Chen
Computer Vision
Long Chen
RoMA: Robust Model Adaptation for Offline Model-based Optimization

RoMA: Robust Model Adaptation for Offline Model-based Optimization Implementation of RoMA: Robust Model Adaptation for Offline Model-based Optimizatio

9 Oct 31, 2022
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.

Machine Learning From Scratch About Python implementations of some of the fundamental Machine Learning models and algorithms from scratch. The purpose

Erik Linder-Norén 21.8k Jan 09, 2023
SpineAI Bilsky Grading With Python

SpineAI-Bilsky-Grading SpineAI Paper with Code 📫 Contact Address correspondence to J.T.P.D.H. (e-mail: james_hallinan AT nuhs.edu.sg) Disclaimer This

<a href=[email protected]"> 2 Dec 16, 2021
A python code to convert Keras pre-trained weights to Pytorch version

Weights_Keras_2_Pytorch 最近想在Pytorch项目里使用一下谷歌的NIMA,但是发现没有预训练好的pytorch权重,于是整理了一下将Keras预训练权重转为Pytorch的代码,目前是支持Keras的Conv2D, Dense, DepthwiseConv2D, Batch

Liu Hengyu 2 Dec 16, 2021
MAT: Mask-Aware Transformer for Large Hole Image Inpainting

MAT: Mask-Aware Transformer for Large Hole Image Inpainting (CVPR2022, Oral) Wenbo Li, Zhe Lin, Kun Zhou, Lu Qi, Yi Wang, Jiaya Jia [Paper] News This

254 Dec 29, 2022
Commonsense Ability Tests

CATS Commonsense Ability Tests Dataset and script for paper Evaluating Commonsense in Pre-trained Language Models Use making_sense.py to run the exper

XUHUI ZHOU 28 Oct 19, 2022
JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation

JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation This the repository for this paper. Find extensions of this w

Zhuoyuan Mao 14 Oct 26, 2022
Contrastive Learning of Structured World Models

Contrastive Learning of Structured World Models This repository contains the official PyTorch implementation of: Contrastive Learning of Structured Wo

Thomas Kipf 371 Jan 06, 2023
New AidForBlind - Various Libraries used like OpenCV and other mentioned in Requirements.txt

AidForBlind Recommended PyCharm IDE Various Libraries used like OpenCV and other

Aalhad Chandewar 1 Jan 13, 2022
Official PyTorch implementation of Joint Object Detection and Multi-Object Tracking with Graph Neural Networks

This is the official PyTorch implementation of our paper: "Joint Object Detection and Multi-Object Tracking with Graph Neural Networks". Our project website and video demos are here.

Richard Wang 443 Dec 06, 2022
MXNet implementation for: Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution

Octave Convolution MXNet implementation for: Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution Imag

Meta Research 549 Dec 28, 2022
Multi-Target Adversarial Frameworks for Domain Adaptation in Semantic Segmentation

Multi-Target Adversarial Frameworks for Domain Adaptation in Semantic Segmentation Paper Multi-Target Adversarial Frameworks for Domain Adaptation in

Valeo.ai 20 Jun 21, 2022
Code for paper "Context-self contrastive pretraining for crop type semantic segmentation"

Code for paper "Context-self contrastive pretraining for crop type semantic segmentation" Setting up a python environment Follow the instruction in ht

Michael Tarasiou 11 Oct 09, 2022
Zero-Shot Text-to-Image Generation VQGAN+CLIP Dockerized

VQGAN-CLIP-Docker About Zero-Shot Text-to-Image Generation VQGAN+CLIP Dockerized This is a stripped and minimal dependency repository for running loca

Kevin Costa 73 Sep 11, 2022
Implementation of SSMF: Shifting Seasonal Matrix Factorization

SSMF Implementation of SSMF: Shifting Seasonal Matrix Factorization, Koki Kawabata, Siddharth Bhatia, Rui Liu, Mohit Wadhwa, Bryan Hooi. NeurIPS, 2021

Koki Kawabata 9 Jun 10, 2022
Spatial-Location-Constraint-Prototype-Loss-for-Open-Set-Recognition

Spatial Location Constraint Prototype Loss for Open Set Recognition Official PyTorch implementation of "Spatial Location Constraint Prototype Loss for

Xia Ziheng 12 Jun 24, 2022
Multimodal commodity image retrieval 多模态商品图像检索

Multimodal commodity image retrieval 多模态商品图像检索 Not finished yet... introduce explain:The specific description of the project and the product image dat

hongjie 8 Nov 25, 2022
DeLag: Detecting Latency Degradation Patterns in Service-based Systems

DeLag: Detecting Latency Degradation Patterns in Service-based Systems Replication package of the work "DeLag: Detecting Latency Degradation Patterns

SEALABQualityGroup @ University of L'Aquila 2 Mar 24, 2022
Underwater image enhancement

LANet Our work proposes an adaptive learning attention network (LANet) to solve the problem of color casts and low illumination in underwater images.

LiuShiBen 7 Sep 14, 2022
[TIP 2021] SADRNet: Self-Aligned Dual Face Regression Networks for Robust 3D Dense Face Alignment and Reconstruction

SADRNet Paper link: SADRNet: Self-Aligned Dual Face Regression Networks for Robust 3D Dense Face Alignment and Reconstruction Requirements python

Multimedia Computing Group, Nanjing University 99 Dec 30, 2022