Unofficial PyTorch implementation of TokenLearner by Google AI

Last update: Dec 20, 2022

Related tags

Deep Learning tokenlearner-pytorch

Overview

tokenlearner-pytorch

Unofficial PyTorch implementation of TokenLearner by Ryoo et al. from Google AI.

Installation

You can install TokenLearner via pip:

pip install tokenlearner-pytorch

Usage

The TokenLearner class is exported by the tokenlearner_pytorch package. It takes a channels-last batch of shape [B, H, W, C] and returns S learned tokens of shape [B, S, C]. As in the paper, you can use this layer with a Vision Transformer, MLP-Mixer, or Video Vision Transformer.

import torch
from tokenlearner_pytorch import TokenLearner

tklr = TokenLearner(S=8)
x = torch.rand(512, 32, 32, 3)
y = tklr(x) # [512, 8, 3]
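To make the shape change above concrete, here is a minimal sketch of the general idea behind token learning: compute S spatial attention maps from the input, then pool the weighted input once per map. This is an illustration under stated assumptions, not the package's actual internals. The class name TinyTokenLearner, its C argument, and the choice of a single linear layer to produce the maps are all hypothetical.

```python
import torch
import torch.nn as nn


class TinyTokenLearner(nn.Module):
    """Illustrative only: S spatial attention maps, each pooled into one token."""

    def __init__(self, C: int, S: int) -> None:
        super().__init__()
        # Assumed design: a per-position linear layer yields S attention logits.
        self.to_maps = nn.Linear(C, S)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, H, W, C = x.shape                      # channels-last input
        flat = x.reshape(B, H * W, C)             # [B, HW, C]
        maps = torch.sigmoid(self.to_maps(flat))  # [B, HW, S], weights in (0, 1)
        # Weight every position by each map, then average over positions.
        tokens = torch.einsum("bns,bnc->bsc", maps, flat) / (H * W)
        return tokens                             # [B, S, C]


x = torch.rand(4, 32, 32, 3)
y = TinyTokenLearner(C=3, S=8)(x)  # 1024 spatial positions reduced to 8 tokens
```

The point of the sketch is the reduction from H*W positions to S tokens. The number of tokens passed to later layers no longer depends on the image resolution.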

You can also use TokenLearner and TokenFuser together with Multi-head Self-Attention as done in the paper:

import torch
import torch.nn as nn
from tokenlearner_pytorch import TokenLearner, TokenFuser

mhsa = nn.MultiheadAttention(3, 1)
tklr = TokenLearner(S=8)
tkfr = TokenFuser(H=32, W=32, C=3, S=8)

x = torch.rand(512, 32, 32, 3) # a batch of images

y = tklr(x)
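y = y.transpose(0, 1) # [8, 512, 3]; nn.MultiheadAttention expects (seq, batch, embed) by default
y, _ = mhsa(y, y, y) # ignore attn weights
y = y.transpose(0, 1).contiguous() # back to [512, 8, 3]; contiguous in case later code calls .view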

out = tkfr(y, x) # [512, 32, 32, 3]
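By default, nn.MultiheadAttention expects sequence-first input of shape [S, B, C], which is why the tokens change layout around the attention call. As an alternative sketch, assuming PyTorch 1.9 or later, passing batch_first=True lets attention consume [B, S, C] tokens directly. The random tensor below is only a stand-in for TokenLearner output, so the snippet runs without the package installed.

```python
import torch
import torch.nn as nn

B, S, C = 16, 8, 3
tokens = torch.rand(B, S, C)  # stand-in for the [B, S, C] output of TokenLearner

# batch_first=True makes the layer accept and return [B, S, C] tensors.
mhsa = nn.MultiheadAttention(embed_dim=C, num_heads=1, batch_first=True)
attended, _ = mhsa(tokens, tokens, tokens)  # ignore attention weights
```

The result keeps the [B, S, C] layout, so it can be handed to the next layer without any reshaping.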

TODO

Add support for temporal dimension T
Implement TokenFuser with ViT
Implement TokenFuser with ViViT
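For the temporal dimension item, one possible approach (a sketch, not something this package currently provides) is to fold T into the batch axis, tokenize every frame independently, and then concatenate the per-frame tokens into a sequence of T*S tokens. The helper tokens_per_frame and the MeanPoolTokens stand-in layer are hypothetical names used only for illustration. Any module mapping [B, H, W, C] to [B, S, C] could take the stand-in's place.

```python
import torch
import torch.nn as nn


def tokens_per_frame(layer: nn.Module, video: torch.Tensor) -> torch.Tensor:
    """Apply a [B, H, W, C] -> [B, S, C] layer to each frame of [B, T, H, W, C]."""
    B, T, H, W, C = video.shape
    frames = video.reshape(B * T, H, W, C)  # fold time into the batch axis
    tokens = layer(frames)                  # [B*T, S, C]
    S = tokens.shape[1]
    return tokens.reshape(B, T * S, C)      # tokens of frame t occupy t*S .. t*S+S-1


class MeanPoolTokens(nn.Module):
    """Hypothetical stand-in token layer: S copies of the spatial mean."""

    def __init__(self, S: int) -> None:
        super().__init__()
        self.S = S

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        mean = x.mean(dim=(1, 2))                        # [B, C]
        return mean.unsqueeze(1).expand(-1, self.S, -1)  # [B, S, C]


video = torch.rand(2, 4, 32, 32, 3)                 # [B, T, H, W, C]
out = tokens_per_frame(MeanPoolTokens(S=8), video)  # [2, 32, 3]
```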

Contributions

If I've made any errors or you have any suggestions, feel free to open an Issue or PR. All contributions are welcome!

License

MIT

Owner

Rishabh Anand
