Unofficial PyTorch implementation of TokenLearner by Google AI

Last update: Dec 20, 2022


tokenlearner-pytorch

Unofficial PyTorch implementation of TokenLearner by Ryoo et al. from Google AI (abs, pdf)

Installation

You can install TokenLearner via pip:

pip install tokenlearner-pytorch

Usage

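The TokenLearner class is exported from the tokenlearner_pytorch package. As in the paper, the layer can be used with a Vision Transformer (ViT), MLP-Mixer, or Video Vision Transformer (ViViT). It takes a channels-last input of shape [B, H, W, C] and returns S learned tokens of shape [B, S, C]: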

import torch
from tokenlearner_pytorch import TokenLearner

tklr = TokenLearner(S=8)
x = torch.rand(512, 32, 32, 3)
y = tklr(x) # [512, 8, 3]
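
To make the shape change above more concrete, here is a minimal sketch of a TokenLearner-style layer: S spatial attention maps are computed from the input, and each map pools the image into one token. This is an illustration only, not the code used by this package. The class name TinyTokenLearner, its C argument, and the single 1x1 convolution are assumptions made for brevity.

```python
import torch
import torch.nn as nn


class TinyTokenLearner(nn.Module):
    """Hypothetical, simplified TokenLearner-style layer (not this package's code)."""

    def __init__(self, C: int, S: int):
        super().__init__()
        # Assumption: a single 1x1 convolution maps C channels to S attention maps.
        self.to_maps = nn.Conv2d(C, S, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x is channels-last: [B, H, W, C]
        B, H, W, C = x.shape
        # Conv2d expects channels-first, so permute to [B, C, H, W] first.
        maps = torch.sigmoid(self.to_maps(x.permute(0, 3, 1, 2)))  # [B, S, H, W]
        maps = maps.flatten(2)          # [B, S, H*W]
        feats = x.reshape(B, H * W, C)  # [B, H*W, C]
        # Each attention map takes a weighted spatial average of the features,
        # producing one token per map.
        return torch.bmm(maps, feats) / (H * W)  # [B, S, C]


x = torch.rand(4, 32, 32, 3)
tokens = TinyTokenLearner(C=3, S=8)(x)
print(tokens.shape)  # torch.Size([4, 8, 3])
```

The number of output tokens depends only on S, not on H or W, which is what lets the layer shrink the sequence passed to later Transformer blocks.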

You can also use TokenLearner and TokenFuser together with Multi-head Self-Attention as done in the paper:

import torch
import torch.nn as nn
from tokenlearner_pytorch import TokenLearner, TokenFuser

mhsa = nn.MultiheadAttention(3, 1)
tklr = TokenLearner(S=8)
tkfr = TokenFuser(H=32, W=32, C=3, S=8)

x = torch.rand(512, 32, 32, 3) # a batch of images

y = tklr(x)
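y = y.permute(1, 0, 2) # [8, 512, 3]; nn.MultiheadAttention expects (seq, batch, embed) by default
y, _ = mhsa(y, y, y) # ignore attn weights
y = y.permute(1, 0, 2).contiguous() # back to [512, 8, 3]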

out = tkfr(y, x) # [512, 32, 32, 3]
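
A note on tensor layout: nn.MultiheadAttention defaults to sequence-first input (seq, batch, embed), while the tokens here are batch-first [B, S, C]. The sketch below shows two ways to bridge the two layouts and checks that they agree. It uses a random tensor as a stand-in for the TokenLearner output, so it does not depend on this package. The variable names are illustrative, and batch_first requires PyTorch 1.9 or newer.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
tokens = torch.rand(512, 8, 3)  # stand-in for TokenLearner output: [B, S, C]

# Option 1: keep the tokens batch-first and tell the attention layer so.
mhsa_bf = nn.MultiheadAttention(embed_dim=3, num_heads=1, batch_first=True)
out_bf, _ = mhsa_bf(tokens, tokens, tokens)  # [512, 8, 3]

# Option 2: default sequence-first layer with explicit transposes.
# Weights are copied from option 1 so the two outputs are comparable.
mhsa_sf = nn.MultiheadAttention(embed_dim=3, num_heads=1)
mhsa_sf.load_state_dict(mhsa_bf.state_dict())
seq_first = tokens.permute(1, 0, 2)  # [8, 512, 3]
out_sf, _ = mhsa_sf(seq_first, seq_first, seq_first)
out_sf = out_sf.permute(1, 0, 2)  # back to [512, 8, 3]

same = torch.allclose(out_bf, out_sf, atol=1e-5)

# A plain reshape to the same target shape is not a transpose:
# it reinterprets memory and mixes tokens from different images.
reshape_differs = not torch.equal(tokens.view(8, 512, 3), seq_first)
print(out_bf.shape, same, reshape_differs)
```

Either option keeps each image's S tokens together during attention; batch_first=True simply avoids the two permutes.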

TODO

Add support for temporal dimension T
Implement TokenFuser with ViT
Implement TokenFuser with ViViT

Contributions

If I've made any errors or you have any suggestions, feel free to raise an Issue or open a PR. All contributions are welcome!

License

MIT


Owner

Rishabh Anand
