Official implementation of the paper Visual Parser: Representing Part-whole Hierarchies with Transformers

Related tags

Deep LearningViP
Overview

Visual Parser (ViP)

This is the official implementation of the paper Visual Parser: Representing Part-whole Hierarchies with Transformers.

Visual Parser

Key Features & TLDR

  1. PyTorch Implementation of the ViP network. Check it out at models/vip.py

  2. A fast and neat implementation of the relative positional encoding proposed in HaloNet, BOTNet and AANet.

  3. A transformer-friendly FLOPS & Param counter that supports FLOPS calculation for einsum and matmul operations.

Prerequisite

Please refer to get_started.md.

Results and Models

All models listed below are evaluated with input size 224x224

Model Top1 Acc #params FLOPS Download
ViP-Tiny 79.0 12.8M 1.7G Google Drive
ViP-Small 82.1 32.1M 4.5G Google Drive
ViP-Medium 83.3 49.6M 8.0G Coming Soon
ViP-Base 83.6 87.8M 15.0G Coming Soon

To load the pretrained checkpoint, e.g. ViP-Tiny, simply run:

# first download the checkpoint and name it as vip_t_dict.pth
from models.vip import vip_tiny
model = vip_tiny(pretrained="vip_t_dict.pth")

Evaluation

To evaluate a pre-trained ViP on ImageNet val, run:

python3 main.py <data-root> --model <model-name> -b <batch-size> --eval_checkpoint <path-to-checkpoint>

Training from scratch

To train a ViP on ImageNet from scratch, run:

bash ./distributed_train.sh <job-name> <config-path> <num-gpus>

For example, to train ViP with 8 GPU on a single node, run:

ViP-Tiny:

bash ./distributed_train.sh vip-t-001 configs/vip_t_bs1024.yaml 8

ViP-Small:

bash ./distributed_train.sh vip-s-001 configs/vip_s_bs1024.yaml 8

ViP-Medium:

bash ./distributed_train.sh vip-m-001 configs/vip_m_bs1024.yaml 8

ViP-Base:

bash ./distributed_train.sh vip-b-001 configs/vip_b_bs1024.yaml 8

Profiling the model

To measure the throughput, run:

python3 test_throughput.py <model-name>

For example, if you want to get the test speed of Vip-Tiny on your device, run:

python3 test_throughput.py vip-tiny

To measure the FLOPS and number of parameters, run:

python3 test_flops.py <model-name>

Citing ViP

@article{vip,
  title={Visual Parser: Representing Part-whole Hierarchies with Transformers},
  author={Sun, Shuyang and Yue, Xiaoyu, Bai, Song and Torr, Philip},
  journal={arXiv preprint arXiv:2107.05790},
  year={2021}
}

Contact

If you have any questions, don't hesitate to contact Shuyang (Kevin) Sun. You can easily reach him by sending an email to [email protected].

Owner
Shuyang Sun
DPhil (PhD) student at Oxford
Shuyang Sun
Face recognize system

FRS Face_recognize_system This project contains my work that target on solving some problems of FRS: Face detection: Retinaface Face anti-spoofing: Fo

Tran Anh Tuan 4 Nov 18, 2021
FS-Mol: A Few-Shot Learning Dataset of Molecules

FS-Mol is A Few-Shot Learning Dataset of Molecules, containing molecular compounds with measurements of activity against a variety of protein targets. The dataset is presented with a model evaluation

Microsoft 114 Dec 15, 2022
An investigation project for SISR.

SISR-Survey An investigation project for SISR. This repository is an official project of the paper "From Beginner to Master: A Survey for Deep Learnin

Juncheng Li 79 Oct 20, 2022
DCGAN-tensorflow - A tensorflow implementation of Deep Convolutional Generative Adversarial Networks

DCGAN in Tensorflow Tensorflow implementation of Deep Convolutional Generative Adversarial Networks which is a stabilize Generative Adversarial Networ

Taehoon Kim 7.1k Dec 29, 2022
🔥 TensorFlow Code for technical report: "YOLOv3: An Incremental Improvement"

🆕 Are you looking for a new YOLOv3 implemented by TF2.0 ? If you hate the fucking tensorflow1.x very much, no worries! I have implemented a new YOLOv

3.6k Dec 26, 2022
Implementation of QuickDraw - an online game developed by Google, combined with AirGesture - a simple gesture recognition application

QuickDraw - AirGesture Introduction Here is my python source code for QuickDraw - an online game developed by google, combined with AirGesture - a sim

Viet Nguyen 89 Dec 18, 2022
Pairwise model for commonlit competition

Pairwise model for commonlit competition To run: - install requirements - create input directory with train_folds.csv and other competition data - cd

abhishek thakur 45 Aug 31, 2022
Official implementation of Representer Point Selection via Local Jacobian Expansion for Post-hoc Classifier Explanation of Deep Neural Networks and Ensemble Models at NeurIPS 2021

Representer Point Selection via Local Jacobian Expansion for Classifier Explanation of Deep Neural Networks and Ensemble Models This repository is the

Yi(Amy) Sui 2 Dec 01, 2021
RuDOLPH: One Hyper-Modal Transformer can be creative as DALL-E and smart as CLIP

[Paper] [Хабр] [Model Card] [Colab] [Kaggle] RuDOLPH 🦌 🎄 ☃️ One Hyper-Modal Tr

Sber AI 230 Dec 31, 2022
MultiSiam: Self-supervised Multi-instance Siamese Representation Learning for Autonomous Driving

MultiSiam: Self-supervised Multi-instance Siamese Representation Learning for Autonomous Driving Code will be available soon. Motivation Architecture

Kai Chen 24 Apr 19, 2022
The fastai book, published as Jupyter Notebooks

English / Spanish / Korean / Chinese / Bengali / Indonesian The fastai book These notebooks cover an introduction to deep learning, fastai, and PyTorc

fast.ai 17k Jan 07, 2023
Hard cater examples from Hopper ICLR paper

CATER-h Honglu Zhou*, Asim Kadav, Farley Lai, Alexandru Niculescu-Mizil, Martin Renqiang Min, Mubbasir Kapadia, Hans Peter Graf (*Contact: honglu.zhou

NECLA ML Group 6 May 11, 2021
A Haskell kernel for IPython.

IHaskell You can now try IHaskell directly in your browser at CoCalc or mybinder.org. Alternatively, watch a talk and demo showing off IHaskell featur

Andrew Gibiansky 2.4k Dec 29, 2022
(CVPR2021) DANNet: A One-Stage Domain Adaptation Network for Unsupervised Nighttime Semantic Segmentation

DANNet: A One-Stage Domain Adaptation Network for Unsupervised Nighttime Semantic Segmentation CVPR2021(oral) [arxiv] Requirements python3.7 pytorch==

W-zx-Y 85 Dec 07, 2022
[CVPR2021] Domain Consensus Clustering for Universal Domain Adaptation

[CVPR2021] Domain Consensus Clustering for Universal Domain Adaptation [Paper] Prerequisites To install requirements: pip install -r requirements.txt

Guangrui Li 84 Dec 26, 2022
A Python Library for Graph Outlier Detection (Anomaly Detection)

PyGOD is a Python library for graph outlier detection (anomaly detection). This exciting yet challenging field has many key applications, e.g., detect

PyGOD Team 757 Jan 04, 2023
Predicting Semantic Map Representations from Images with Pyramid Occupancy Networks

This is the code associated with the paper Predicting Semantic Map Representations from Images with Pyramid Occupancy Networks, published at CVPR 2020.

Thomas Roddick 219 Dec 20, 2022
🏅 The Most Comprehensive List of Kaggle Solutions and Ideas 🏅

🏅 Collection of Kaggle Solutions and Ideas 🏅

Farid Rashidi 2.3k Jan 08, 2023
Y. Zhang, Q. Yao, W. Dai, L. Chen. AutoSF: Searching Scoring Functions for Knowledge Graph Embedding. IEEE International Conference on Data Engineering (ICDE). 2020

AutoSF The code for our paper "AutoSF: Searching Scoring Functions for Knowledge Graph Embedding" and this paper has been accepted by ICDE2020. News:

AutoML Research 64 Dec 17, 2022
S-attack library. Official implementation of two papers "Are socially-aware trajectory prediction models really socially-aware?" and "Vehicle trajectory prediction works, but not everywhere".

S-attack library: A library for evaluating trajectory prediction models This library contains two research projects to assess the trajectory predictio

VITA lab at EPFL 71 Jan 04, 2023