Implementation of Deformable Attention in Pytorch from the paper "Vision Transformer with Deformable Attention"

Overview

Deformable Attention

Implementation of Deformable Attention from this paper in Pytorch, which appears to be an improvement over what was proposed in DETR. The relative positional embedding has also been modified for better extrapolation, using the continuous positional embedding proposed in SwinV2.
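
For intuition, here is a rough sketch of a SwinV2-style continuous position bias: a small MLP maps log-spaced relative coordinates to a per-head bias that is added to the attention logits. The class and parameter names are assumed for illustration; this is not the package's internal implementation.

import torch
from torch import nn

class ContinuousPositionBias(nn.Module):
    def __init__(self, heads, hidden_dim = 256):
        super().__init__()
        # MLP from a relative (dy, dx) coordinate to one bias value per head
        self.mlp = nn.Sequential(
            nn.Linear(2, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, heads)
        )

    def forward(self, h, w):
        # all pairwise relative coordinates of an h x w grid
        coords = torch.stack(torch.meshgrid(
            torch.arange(h), torch.arange(w), indexing = 'ij'), dim = -1)
        coords = coords.reshape(-1, 2).float()            # (h*w, 2)
        rel = coords[:, None, :] - coords[None, :, :]     # (h*w, h*w, 2)
        # log-spaced coordinates, as in SwinV2, for better extrapolation
        rel = torch.sign(rel) * torch.log1p(rel.abs())
        return self.mlp(rel).permute(2, 0, 1)             # (heads, h*w, h*w)

cpb = ContinuousPositionBias(heads = 8)
bias = cpb(16, 16) # (8, 256, 256), added to the attention logits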

Install

$ pip install deformable-attention

Usage

import torch
from deformable_attention import DeformableAttention

attn = DeformableAttention(
    dim = 512,                   # feature dimensions
    dim_head = 64,               # dimension per head
    heads = 8,                   # attention heads
    dropout = 0.,                # dropout
    downsample_factor = 4,       # downsample factor (r in paper)
    offset_scale = 4,            # scale of offset, maximum offset
    offset_groups = None,        # number of offset groups, should be a multiple of heads
    offset_kernel_size = 6,      # offset kernel size
)

x = torch.randn(1, 512, 64, 64)
attn(x) # (1, 512, 64, 64)
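
The heart of the technique is predicting offsets on a downsampled grid and bilinearly sampling keys/values at the deformed positions. The sketch below distills that idea with assumed names (DeformedSampler, to_offsets); it is a simplified illustration, not this repository's code, which additionally groups offsets and applies attention over the sampled points.

import torch
import torch.nn.functional as F
from torch import nn

class DeformedSampler(nn.Module):
    def __init__(self, dim, downsample_factor = 4):
        super().__init__()
        # depthwise conv downsamples, then a 1x1 conv predicts (x, y) offsets
        self.to_offsets = nn.Sequential(
            nn.Conv2d(dim, dim, downsample_factor, stride = downsample_factor, groups = dim),
            nn.GELU(),
            nn.Conv2d(dim, 2, 1),
            nn.Tanh()  # keep offsets within [-1, 1] normalized coordinates
        )

    def forward(self, x):
        offsets = self.to_offsets(x)          # (b, 2, h/r, w/r)
        oh, ow = offsets.shape[-2:]

        # reference grid in the normalized [-1, 1] coordinates grid_sample expects
        ys = torch.linspace(-1., 1., oh, device = x.device)
        xs = torch.linspace(-1., 1., ow, device = x.device)
        grid_y, grid_x = torch.meshgrid(ys, xs, indexing = 'ij')
        grid = torch.stack((grid_x, grid_y), dim = -1)      # (oh, ow, 2)

        # deform the grid and bilinearly sample the feature map
        deformed = grid + offsets.permute(0, 2, 3, 1)       # (b, oh, ow, 2)
        return F.grid_sample(x, deformed, mode = 'bilinear', align_corners = False)

sampler = DeformedSampler(512)
kv = sampler(torch.randn(1, 512, 64, 64)) # (1, 512, 16, 16) deformed keys/values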

3d deformable attention

import torch
from deformable_attention import DeformableAttention3D

attn = DeformableAttention3D(
    dim = 512,                          # feature dimensions
    dim_head = 64,                      # dimension per head
    heads = 8,                          # attention heads
    dropout = 0.,                       # dropout
    downsample_factor = (2, 8, 8),      # downsample factor (r in paper)
    offset_scale = (2, 8, 8),           # scale of offset, maximum offset
    offset_kernel_size = (4, 10, 10),   # offset kernel size
)

x = torch.randn(1, 512, 10, 32, 32) # (batch, dimension, frames, height, width)
attn(x) # (1, 512, 10, 32, 32)
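
Note the channels-first video layout above. If your clips are stored frames-first as (batch, frames, channels, height, width), a common convention in video pipelines, permute before the call; a small usage sketch, reusing the attn module constructed above:

import torch

video = torch.randn(1, 10, 512, 32, 32)        # (batch, frames, channels, height, width)
x = video.permute(0, 2, 1, 3, 4).contiguous()  # -> (batch, channels, frames, height, width)
out = attn(x)                                  # (1, 512, 10, 32, 32)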

1d deformable attention for good measure

import torch
from deformable_attention import DeformableAttention1D

attn = DeformableAttention1D(
    dim = 128,
    downsample_factor = 4,
    offset_scale = 2,
    offset_kernel_size = 6
)

x = torch.randn(1, 128, 512)
attn(x) # (1, 128, 512)
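
Transformer code usually carries sequences as (batch, seq, dim), while DeformableAttention1D takes (batch, dim, seq), so transpose around the call. A pre-norm residual wrapper might look like the following (DeformableBlock1D is an assumed name, not part of the package):

import torch
from torch import nn
from deformable_attention import DeformableAttention1D

class DeformableBlock1D(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = DeformableAttention1D(
            dim = dim,
            downsample_factor = 4,
            offset_scale = 2,
            offset_kernel_size = 6
        )

    def forward(self, x):                    # x: (batch, seq, dim)
        y = self.norm(x).transpose(1, 2)     # -> (batch, dim, seq)
        return x + self.attn(y).transpose(1, 2)

block = DeformableBlock1D(128)
out = block(torch.randn(1, 512, 128)) # (1, 512, 128)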

Citation

@misc{xia2022vision,
    title   = {Vision Transformer with Deformable Attention}, 
    author  = {Zhuofan Xia and Xuran Pan and Shiji Song and Li Erran Li and Gao Huang},
    year    = {2022},
    eprint  = {2201.00520},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}

@misc{liu2021swin,
    title   = {Swin Transformer V2: Scaling Up Capacity and Resolution},
    author  = {Ze Liu and Han Hu and Yutong Lin and Zhuliang Yao and Zhenda Xie and Yixuan Wei and Jia Ning and Yue Cao and Zheng Zhang and Li Dong and Furu Wei and Baining Guo},
    year    = {2021},
    eprint  = {2111.09883},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}

Comments
  • The relationship between 'dim' and 'inner_dim'

    Hi, I have a question about the DeformableAttention module.

    I calculated the output shapes of the forward pass step by step. According to my calculation, the code works only when 'dim' and 'inner_dim' are the same.

    Is there any reason why you implemented it this way?

    Best regards, Hankyu

    opened by hanq0212 4
  • TypeError: meshgrid() got an unexpected keyword argument 'indexing'

    @lucidrains while trying to run

    import torch
    from deformable_attention import DeformableAttention

    attn = DeformableAttention(
        dim = 512, dim_head = 64, heads = 8, dropout = 0.,
        downsample_factor = 4, offset_scale = 4,
        offset_groups = None, offset_kernel_size = 6,
    )

    x = torch.randn(1, 512, 64, 64)
    attn(x)

    I got the error below from this line. Kindly help.

    https://github.com/lucidrains/deformable-attention/blob/9f3c0ae35652ce54687e0db409921018bfca3310/deformable_attention/deformable_attention_2d.py#L26

    opened by ChidanandKumarKS 1