GluonMM is a library of transformer models for computer vision and multi-modality research

Last update: Dec 02, 2022

Overview

GluonMM

GluonMM is a library of transformer models for computer vision and multi-modality research. It contains reference implementations of widely adopted baseline models as well as research work from Amazon Research.

Install

First, clone the repository locally:

git clone https://github.com/amazon-research/gluonmm.git

Then install the dependencies:

conda create -n gluonmm python=3.7
conda activate gluonmm
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
pip install timm tensorboardX yacs tqdm requests pandas decord scikit-image opencv-python

# Install apex for half-precision training (optional)
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

GluonMM has been tested extensively with PyTorch 1.8.1, torchvision 0.9.1, and CUDA 10.2.
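After installing, it can be useful to confirm that the environment matches the tested versions listed above. The following is a minimal sketch, not part of GluonMM: the helper names (installed_version, matches_tested, check_environment) are hypothetical, and the script only reads each package's __version__ attribute. It assumes that a local build suffix such as "+cu102" can be ignored when comparing release numbers.

```python
import importlib

# Versions this README reports as tested; other versions may also work.
TESTED = {"torch": "1.8.1", "torchvision": "0.9.1"}


def installed_version(name):
    """Return the package's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def matches_tested(found, tested):
    """Compare release numbers, ignoring local build suffixes such as '+cu102'."""
    return found is not None and found.split("+")[0] == tested


def check_environment():
    """Map each package name to (installed version or None, matches tested version)."""
    report = {}
    for name, tested in TESTED.items():
        found = installed_version(name)
        report[name] = (found, matches_tested(found, tested))
    return report


if __name__ == "__main__":
    for name, (found, ok) in check_environment().items():
        if ok:
            status = "matches tested version"
        else:
            status = "differs from tested version " + TESTED[name]
        print(f"{name}: {found or 'not installed'} ({status})")
```

A mismatch does not necessarily mean the library will fail; it only indicates that the environment differs from the configuration reported as tested.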

Model zoo

Image classification

Video action recognition

VidTr

Usage

For detailed usage, please refer to the README file in each model family. For example, the training, evaluation, and model zoo information for the video transformer VidTr can be found in the VidTr README.

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.

Acknowledgement

Parts of the code are heavily derived from pytorch-image-models, DeiT, Swin-transformer, vit-pytorch and vision_transformer.
