Efficient Attention

An implementation of the efficient attention module.

Description

Efficient attention is an attention mechanism that substantially reduces memory and computational costs while retaining exactly the same expressive power as conventional dot-product attention; a minimal sketch of the core computation follows the list below. The illustration above compares the two types of attention. The efficient attention module is a drop-in replacement for the non-local module (Wang et al., 2018), while it:

  • uses fewer resources to achieve the same accuracy;
  • achieves higher accuracy with the same resource constraints (by allowing more insertions); and
  • is applicable in domains and models where the non-local module is not (due to resource constraints).
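
To make the complexity difference concrete, here is a minimal, self-contained sketch of the core computation as described in the paper. The tensor sizes and variable names are illustrative only and are not part of this repository's API.

import torch
import torch.nn.functional as F

# n = number of positions, d = channel dimension; in vision models n >> d.
n, d = 4096, 64
q = torch.randn(n, d)
k = torch.randn(n, d)
v = torch.randn(n, d)

# Conventional dot-product attention: materializes an n x n matrix -> O(n^2) memory.
dot_product = F.softmax(q @ k.t(), dim=-1) @ v                        # (n, n) @ (n, d)

# Efficient attention: normalize queries and keys separately, then multiply k^T v first.
# The intermediate "global context" matrix is only d x d, so time is O(n * d^2).
efficient = F.softmax(q, dim=-1) @ (F.softmax(k, dim=0).t() @ v)      # (n, d) @ (d, d)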

Resources

YouTube:

bilibili (for users in Mainland China):

Implementation details

This repository implements the efficient attention module with softmax normalization, output reprojection, and residual connection.
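
Below is a single-head sketch of how such a module can be assembled in PyTorch. The class name, constructor arguments, and 1x1-convolution projections are assumptions for illustration and may differ from the actual module in this repository.

import torch
from torch import nn

class EfficientAttentionSketch(nn.Module):
    """Illustrative single-head sketch with softmax normalization,
    output reprojection, and a residual connection."""

    def __init__(self, in_channels: int, key_channels: int, value_channels: int):
        super().__init__()
        self.to_query = nn.Conv2d(in_channels, key_channels, 1)
        self.to_key = nn.Conv2d(in_channels, key_channels, 1)
        self.to_value = nn.Conv2d(in_channels, value_channels, 1)
        self.reprojection = nn.Conv2d(value_channels, in_channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, _, h, w = x.shape
        queries = self.to_query(x).flatten(2)                  # (b, key_channels, h*w)
        keys = self.to_key(x).flatten(2)                       # (b, key_channels, h*w)
        values = self.to_value(x).flatten(2)                   # (b, value_channels, h*w)

        # Softmax normalization: queries over channels, keys over positions.
        queries = queries.softmax(dim=1)
        keys = keys.softmax(dim=2)

        # Build the global context first (value_channels x key_channels),
        # then apply it to the queries.
        context = values @ keys.transpose(1, 2)                # (b, value_channels, key_channels)
        attended = (context @ queries).reshape(b, -1, h, w)    # (b, value_channels, h, w)

        # Output reprojection and residual connection.
        return self.reprojection(attended) + x

Because the output is reprojected back to in_channels and added to the input, a module of this shape can be inserted into an existing backbone without changing the surrounding layers.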

Features not in the paper

This repository additionally implements the multi-head mechanism, which is not in the paper. To learn more about the mechanism, refer to Vaswani et al. (2017).
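
The snippet below sketches one way to combine the multi-head mechanism with the efficient attention computation, following the split-and-concatenate scheme of Vaswani et al.; it is an assumption for illustration, not necessarily this repository's exact implementation.

import torch

# Split queries, keys, and values into head_count chunks along the channel
# axis, normalize and attend per head, then concatenate the per-head outputs.
b, key_channels, value_channels, n, head_count = 2, 32, 64, 1024, 4
queries = torch.randn(b, key_channels, n)
keys = torch.randn(b, key_channels, n)
values = torch.randn(b, value_channels, n)

heads = []
for q, k, v in zip(queries.chunk(head_count, dim=1),
                   keys.chunk(head_count, dim=1),
                   values.chunk(head_count, dim=1)):
    q = q.softmax(dim=1)                     # normalize query channels per head
    k = k.softmax(dim=2)                     # normalize key positions per head
    heads.append((v @ k.transpose(1, 2)) @ q)
attended = torch.cat(heads, dim=1)           # (b, value_channels, n)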

Citation

The paper will appear at WACV 2021. If you use, compare with, or refer to this work, please cite

@inproceedings{shen2021efficient,
    author = {Zhuoran Shen and Mingyuan Zhang and Haiyu Zhao and Shuai Yi and Hongsheng Li},
    title = {Efficient Attention: Attention with Linear Complexities},
    booktitle = {WACV},
    year = {2021},
}