Scaling Vision with Sparse Mixture of Experts

Related tags

Deep Learningvmoe
Overview

Scaling Vision with Sparse Mixture of Experts

This repository contains the code for training and fine-tuning Sparse MoE models for vision (V-MoE) on ImageNet-21k, reproducing the results presented in the paper:

We will soon provide a colab analysing one of the models that we have released, as well as "config" files to train from scratch and fine-tune checkpoints. Stay tuned.

Installation

Simply clone this repository.

The file requirements.txt contains the requirements that can be installed via PyPi. However, we recommend installing jax, flax and optax directly from GitHub, since we use some of the latest features that are not part of any release yet.

In addition, you also have to clone the Vision Transformer repository, since we use some parts of it.

If you want to use RandAugment to train models (which we recommend if you train on ImageNet-21k or ILSVRC2012 from scratch), you must also clone the Cloud TPU repository, and name it cloud_tpu.

Checkpoints

We release the checkpoints containing the weights of some models that we trained on ImageNet (either ILSVRC2012 or ImageNet-21k). All checkpoints contain an index file (with .index extension) and one or multiple data files ( with extension .data-nnnnn-of-NNNNN, called shards). In the following list, we indicate only the prefix of each checkpoint. We recommend using gsutil to obtain the full list of files, download them, etc.

  • V-MoE S/32, 8 experts on the last two odd blocks, trained from scratch on ILSVRC2012 with RandAugment: gs://vmoe_checkpoints/vmoe_s32_last2_ilsvrc2012_randaug_medium.
  • V-MoE B/16, 8 experts on every odd block, trained from scratch on ImageNet-21k with RandAugment: gs://vmoe_checkpoints/vmoe_b16_imagenet21k_randaug_strong.
    • Fine-tuned on ILSVRC2012: gs://vmoe_checkpoints/vmoe_b16_imagenet21k_randaug_strong_ft_ilsvrc2012

Disclaimers

This is not an officially supported Google product.

Owner
Google Research
Google Research
Code for the paper "How Attentive are Graph Attention Networks?"

How Attentive are Graph Attention Networks? This repository is the official implementation of How Attentive are Graph Attention Networks?. The PyTorch

175 Dec 29, 2022
Imaginaire - NVIDIA's Deep Imagination Team's PyTorch Library

Imaginaire Docs | License | Installation | Model Zoo Imaginaire is a pytorch library that contains optimized implementation of several image and video

NVIDIA Research Projects 3.6k Dec 29, 2022
CausaLM: Causal Model Explanation Through Counterfactual Language Models

CausaLM: Causal Model Explanation Through Counterfactual Language Models Authors: Amir Feder, Nadav Oved, Uri Shalit, Roi Reichart Abstract: Understan

Amir Feder 39 Jul 10, 2022
Fully convolutional networks for semantic segmentation

FCN-semantic-segmentation Simple end-to-end semantic segmentation using fully convolutional networks [1]. Takes a pretrained 34-layer ResNet [2], remo

Kai Arulkumaran 186 Dec 25, 2022
Code For TDEER: An Efficient Translating Decoding Schema for Joint Extraction of Entities and Relations (EMNLP2021)

TDEER (WIP) Code For TDEER: An Efficient Translating Decoding Schema for Joint Extraction of Entities and Relations (EMNLP2021) Overview TDEER is an e

Alipay 6 Dec 17, 2022
Using image super resolution models with vapoursynth and speeding them up with TensorRT

vs-RealEsrganAnime-tensorrt-docker Using image super resolution models with vapoursynth and speeding them up with TensorRT. Also a docker image since

4 Aug 23, 2022
Efficient and Scalable Physics-Informed Deep Learning and Scientific Machine Learning on top of Tensorflow for multi-worker distributed computing

Notice: Support for Python 3.6 will be dropped in v.0.2.1, please plan accordingly! Efficient and Scalable Physics-Informed Deep Learning Collocation-

tensordiffeq 74 Dec 09, 2022
Namish Khanna 40 Oct 11, 2022
Deep Learning and Logical Reasoning from Data and Knowledge

Logic Tensor Networks (LTN) Logic Tensor Network (LTN) is a neurosymbolic framework that supports querying, learning and reasoning with both rich data

171 Dec 29, 2022
Dist2Dec: A Simplicial Neural Network for Homology Localization

Dist2Dec: A Simplicial Neural Network for Homology Localization

Alexandros Keros 6 Jun 12, 2022
Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples

Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples (WACV 2022) and Beyond Simple Meta-Learning: Multi-Purpose Model

PLAI Group at UBC 42 Dec 06, 2022
CS5242_2021 - Neural Networks and Deep Learning, NUS CS5242, 2021

CS5242_2021 Neural Networks and Deep Learning, NUS CS5242, 2021 Cloud Machine #1 : Google Colab (Free GPU) Follow this Notebook installation : https:/

Xavier Bresson 165 Oct 25, 2022
FID calculation with proper image resizing and quantization steps

clean-fid: Fixing Inconsistencies in FID Project | Paper The FID calculation involves many steps that can produce inconsistencies in the final metric.

Gaurav Parmar 606 Jan 06, 2023
Reinforcement Learning Theory Book (rus)

Reinforcement Learning Theory Book (rus)

qbrick 206 Nov 27, 2022
Code for "Typilus: Neural Type Hints" PLDI 2020

Typilus A deep learning algorithm for predicting types in Python. Please find a preprint here. This repository contains its implementation (src/) and

47 Nov 08, 2022
YOLTv5 rapidly detects objects in arbitrarily large aerial or satellite images that far exceed the ~600×600 pixel size typically ingested by deep learning object detection frameworks

YOLTv5 rapidly detects objects in arbitrarily large aerial or satellite images that far exceed the ~600×600 pixel size typically ingested by deep learning object detection frameworks.

Adam Van Etten 145 Jan 01, 2023
A-ESRGAN aims to provide better super-resolution images by using multi-scale attention U-net discriminators.

A-ESRGAN: Training Real-World Blind Super-Resolution with Attention-based U-net Discriminators The authors are hidden for the purpose of double blind

77 Dec 16, 2022
Code release for NeuS

NeuS We present a novel neural surface reconstruction method, called NeuS, for reconstructing objects and scenes with high fidelity from 2D image inpu

Peng Wang 813 Jan 04, 2023
In-Place Activated BatchNorm for Memory-Optimized Training of DNNs

In-Place Activated BatchNorm In-Place Activated BatchNorm for Memory-Optimized Training of DNNs In-Place Activated BatchNorm (InPlace-ABN) is a novel

1.3k Dec 29, 2022
Final project for Intro to CS class.

Financial Analysis Web App https://share.streamlit.io/mayurk1/fin-web-app-final-project/webApp.py 1. Project Description This project is a technical a

Mayur Khanna 1 Dec 10, 2021