CVNets: A library for training computer vision networks

This repository contains the source code for training computer vision models. Specifically, it contains the source code of the MobileViT paper for the following tasks:

Image classification on the ImageNet dataset
Object detection using SSD
Semantic segmentation using Deeplabv3

Note: Any image classification backbone can be used with the object detection and semantic segmentation models.

Training can be done with two samplers:

Standard distributed sampler
Multi-scale distributed sampler

We recommend using the multi-scale sampler, as it improves generalization and leads to better performance. See the MobileViT paper for details.
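To make the idea behind multi-scale sampling concrete, here is a minimal single-process sketch. It is not the CVNets implementation: the function name `multi_scale_batches`, the list of resolutions, and the base batch size are all illustrative assumptions, and the distributed part (sharding indices across GPUs) is left out. It assumes the scheme described in the MobileViT paper, in which each batch uses a randomly chosen resolution and the batch size is scaled inversely with the image area, so smaller images are processed in larger batches.

```python
import random


def multi_scale_batches(num_samples, base_size=(256, 256), base_batch=32,
                        scales=((160, 160), (192, 192), (256, 256),
                                (288, 288), (320, 320)),
                        seed=0):
    """Yield (height, width, indices) batches at randomly chosen resolutions.

    All default values are illustrative assumptions, not CVNets settings.
    The batch size shrinks as the resolution grows so that every batch
    covers roughly the same number of pixels as the base configuration.
    """
    rng = random.Random(seed)
    indices = list(range(num_samples))
    rng.shuffle(indices)

    base_area = base_size[0] * base_size[1]
    start = 0
    while start < num_samples:
        height, width = rng.choice(scales)
        # Keep (batch size * image area) approximately constant.
        batch_size = max(1, (base_area * base_batch) // (height * width))
        yield height, width, indices[start:start + batch_size]
        start += batch_size


# Usage: inspect which resolution and batch size each iteration would use.
for height, width, batch in multi_scale_batches(num_samples=200):
    print(f"{height}x{width}: {len(batch)} images")
```

In a real training loop, each yielded batch of indices would be loaded and resized to the chosen height and width before being passed to the model.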

Installation

CVNets can be installed in a local Python environment with the following commands:

    git clone git@github.com:apple/ml-cvnets.git
    cd ml-cvnets
    pip install -r requirements.txt
    pip install --editable .

We recommend using Python 3.6+ and PyTorch (version >= 1.8.0) in a conda environment. To set up a Python environment with conda, see here.
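Before training, it can help to confirm that the active environment meets the recommended versions. The following is a small sketch, not part of CVNets: the helper names `version_tuple` and `check_environment` are hypothetical, and the thresholds simply mirror the recommendation above.

```python
import re
import sys


def version_tuple(version):
    """Convert a version string such as '1.8.0+cu111' into a tuple of ints."""
    parts = []
    # Drop any local build suffix ('+cu111') and keep major.minor.patch.
    for piece in version.split("+")[0].split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def check_environment(min_python=(3, 6), min_torch=(1, 8, 0)):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append("Python %d.%d is older than the recommended %d.%d"
                        % (sys.version_info[:2] + min_python))
    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed")
    else:
        if version_tuple(torch.__version__) < min_torch:
            problems.append("PyTorch %s is older than the recommended %s"
                            % (torch.__version__,
                               ".".join(map(str, min_torch))))
    return problems


print(check_environment() or "Environment looks OK")
```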

Getting Started

General instructions for training and evaluating different models are given here.
Examples for training and evaluating a specific model are provided in the examples folder. Right now, we support the following models.
For converting PyTorch models to CoreML, see README-pytorch-to-coreml.md.

Citation

If you find our work useful, please cite the following paper:

@article{mehta2021mobilevit,
  title={MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer},
  author={Mehta, Sachin and Rastegari, Mohammad},
  journal={arXiv preprint arXiv:2110.02178},
  year={2021}
}

Owner

Apple