RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition

Related tags

Deep LearningRepMLP
Overview

RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition (PyTorch)

Paper: https://arxiv.org/abs/2105.01883

Citation:

@article{ding2021repmlp,
title={RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition},
author={Ding, Xiaohan and Zhang, Xiangyu and Han, Jungong and Ding, Guiguang},
journal={arXiv preprint arXiv:2105.01883},
year={2021}
}

How to use the code

If you want to use RepMLP as a building block in your model, just check repmlp.py. It also shows an example of checking the equivalence between a training-time and an inference-time RepMLP. You can see that by

python repmlp.py

Just use it like this

from repmlp.py import *
your_model = YourModel(...)   # It has RepMLPs somewhere
train(your_model)
deploy_model = repmlp_model_convert(your_model)
test(deploy_model)

From repmlp_model_convert, you will see that the conversion is as simple as calling switch_to_deploy of every RepMLP.

The definition of the two block structures (RepMLP Bottleneck and RepMLP Light) are shown in repmlp_blocks.py. The RepMLP-ResNet is defined in repmlp_resnet.py.

Use our pre-trained models

You may download our pre-trained models from Google Drive or Baidu Cloud (the access key of Baidu is "rmlp").

python test.py [imagenet-folder] train RepMLP-Res50-light-224_train.pth -a RepMLP-Res50-light-224

Here imagenet-folder should contain the "train" and "val" folders. The default input resolution is 224x224. Here "train" indicates the training-time architecture.

You may convert them into the inference-time structure and test again to check the equivalence. For example

python convert.py RepMLP-Res50-light-224_train.pth RepMLP-Res50-light-224_deploy.pth -a RepMLP-Res50-light-224
python test.py [imagenet-folder] deploy RepMLP-Res50-light-224_deploy.pth -a RepMLP-Res50-light-224

Now "deploy" indicates the inference-time structure (without Local Perceptron).

Abstract

We propose RepMLP, a multi-layer-perceptron-style neural network building block for image recognition, which is composed of a series of fully-connected (FC) layers. Compared to convolutional layers, FC layers are more efficient, better at modeling the long-range dependencies and positional patterns, but worse at capturing the local structures, hence usually less favored for image recognition. We propose a structural re-parameterization technique that adds local prior into an FC to make it powerful for image recognition. Specifically, we construct convolutional layers inside a RepMLP during training and merge them into the FC for inference. On CIFAR, a simple pure-MLP model shows performance very close to CNN. By inserting RepMLP in traditional CNN, we improve ResNets by 1.8% accuracy on ImageNet, 2.9% for face recognition, and 2.3% mIoU on Cityscapes with lower FLOPs. Our intriguing findings highlight that combining the global representational capacity and positional perception of FC with the local prior of convolution can improve the performance of neural network with faster speed on both the tasks with translation invariance (e.g., semantic segmentation) and those with aligned images and positional patterns (e.g., face recognition).

FAQs

Q: Is the inference-time model's output the same as the training-time model?

A: Yes. You can verify that by

python repmlp.py

Q: How to use RepMLP for other tasks?

A: It is better to finetune the training-time model on your datasets. Then you should do the conversion after finetuning and before you deploy the models. For example, say you want to use RepMLP-Res50 and PSPNet for semantic segmentation, you should build a PSPNet with a training-time RepMLP-Res50 as the backbone, load pre-trained weights into the backbone, and finetune the PSPNet on your segmentation dataset. Then you should convert the backbone following the code provided in this repo and keep the other task-specific structures (the PSPNet parts, in this case). The pseudo code will be like

#   train_backbone = create_xxx(deploy=False)
#   train_backbone.load_state_dict(torch.load(...))
#   train_pspnet = build_pspnet(backbone=train_backbone)
#   segmentation_train(train_pspnet)
#   deploy_pspnet = repmlp_model_convert(train_pspnet)
#   segmentation_test(deploy_pspnet)

Finetuning with a converted model also makes sense if you insert a BN after fc3, but the performance may be slightly lower.

Q: How to quantize a model with RepMLP?

A1: Post-training quantization. After training and conversion, you may quantize the converted model with any post-training quantization method. Then you may insert a BN after fc3 and finetune to recover the accuracy just like you quantize and finetune the other models. This is the recommended solution.

A2: Quantization-aware training. During the quantization-aware training, instead of constraining the params in a single kernel (e.g., making every param in {-127, -126, .., 126, 127} for int8) for ordinary models, you should constrain the equivalent kernel (get_equivalent_fc1_fc3_params() in repmlp.py).

Q: I tried to finetune your model with multiple GPUs but got an error. Why are the names of params like "stage1.0..." in the downloaded weight file but sometimes like "module.stage1.0..." (shown by nn.Module.named_parameters()) in my model?

A: DistributedDataParallel may prefix "module." to the name of params and cause a mismatch when loading weights by name. The simplest solution is to load the weights (model.load_state_dict(...)) before DistributedDataParallel(model). Otherwise, you may insert "module." before the names like this

checkpoint = torch.load(...)    # This is just a name-value dict
ckpt = {('module.' + k) : v for k, v in checkpoint.items()}
model.load_state_dict(ckpt)

Q: So a RepMLP derives the equivalent big fc kernel before each forwarding to save computations?

A: No! More precisely, we do the conversion only once right after training. Then the training-time model can be discarded, and the resultant model has no conv branches. We only save and use the resultant model.

Contact

[email protected]

Google Scholar Profile: https://scholar.google.com/citations?user=CIjw0KoAAAAJ&hl=en

My open-sourced papers and repos:

The Structural Re-parameterization Universe:

  1. (preprint, 2021) A powerful MLP-style CNN building block
    RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition
    code.

  2. (CVPR 2021) A super simple and powerful VGG-style ConvNet architecture. Up to 83.55% ImageNet top-1 accuracy!
    RepVGG: Making VGG-style ConvNets Great Again
    code.

  3. (preprint, 2020) State-of-the-art channel pruning
    Lossless CNN Channel Pruning via Decoupling Remembering and Forgetting
    code.

  4. ACB (ICCV 2019) is a CNN component without any inference-time costs. The first work of our Structural Re-parameterization Universe.
    ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks.
    code.

  5. DBB (CVPR 2021) is a CNN component with higher performance than ACB and still no inference-time costs. Sometimes I call it ACNet v2 because "DBB" is 2 bits larger than "ACB" in ASCII (lol).
    Diverse Branch Block: Building a Convolution as an Inception-like Unit
    code.

Model compression and acceleration:

  1. (CVPR 2019) Channel pruning: Centripetal SGD for Pruning Very Deep Convolutional Networks with Complicated Structure
    code

  2. (ICML 2019) Channel pruning: Approximated Oracle Filter Pruning for Destructive CNN Width Optimization
    code

  3. (NeurIPS 2019) Unstructured pruning: Global Sparse Momentum SGD for Pruning Very Deep Neural Networks
    code

Accelerated deep learning R&D

Accelerated deep learning R&D PyTorch framework for Deep Learning research and development. It focuses on reproducibility, rapid experimentation, and

Catalyst-Team 3.1k Jan 06, 2023
PyTorch code for the paper "Curriculum Graph Co-Teaching for Multi-target Domain Adaptation" (CVPR2021)

PyTorch code for the paper "Curriculum Graph Co-Teaching for Multi-target Domain Adaptation" (CVPR2021) This repo presents PyTorch implementation of M

Evgeny 79 Dec 19, 2022
Tooling for converting STAC metadata to ODC data model

手语识别 0、使用到的模型 (1). openpose,作者:CMU-Perceptual-Computing-Lab https://github.com/CMU-Perceptual-Computing-Lab/openpose (2). 图像分类classification,作者:Bubbl

Open Data Cube 65 Dec 20, 2022
Image-to-Image Translation with Conditional Adversarial Networks (Pix2pix) implementation in keras

pix2pix-keras Pix2pix implementation in keras. Original paper: Image-to-Image Translation with Conditional Adversarial Networks (pix2pix) Paper Author

William Falcon 141 Dec 30, 2022
Learning from Guided Play: A Scheduled Hierarchical Approach for Improving Exploration in Adversarial Imitation Learning Source Code

Learning from Guided Play: A Scheduled Hierarchical Approach for Improving Exploration in Adversarial Imitation Learning Source Code

STARS Laboratory 8 Sep 14, 2022
Kinetics-Data-Preprocessing

Kinetics-Data-Preprocessing Kinetics-400 and Kinetics-600 are common video recognition datasets used by popular video understanding projects like Slow

Kaihua Tang 7 Oct 27, 2022
[CVPR 2021] Anycost GANs for Interactive Image Synthesis and Editing

Anycost GAN video | paper | website Anycost GANs for Interactive Image Synthesis and Editing Ji Lin, Richard Zhang, Frieder Ganz, Song Han, Jun-Yan Zh

MIT HAN Lab 726 Dec 28, 2022
Colour detection is necessary to recognize objects, it is also used as a tool in various image editing and drawing apps.

Colour Detection On Image Colour detection is the process of detecting the name of any color. Simple isn’t it? Well, for humans this is an extremely e

Astitva Veer Garg 1 Jan 13, 2022
face property detection pytorch

This is the face property train code of project face-detection-project

i am x 2 Oct 18, 2021
Imaging, analysis, and simulation software for radio interferometry

ehtim (eht-imaging) Python modules for simulating and manipulating VLBI data and producing images with regularized maximum likelihood methods. This ve

Andrew Chael 5.2k Dec 28, 2022
Source code for paper "Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling", AAAI 2021

ATLOP Code for AAAI 2021 paper Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling. If you make use of this co

Wenxuan Zhou 146 Nov 29, 2022
上海交通大学全自动抢课脚本,支持准点开抢与抢课后持续捡漏两种模式。2021/06/08更新。

Welcome to Course-Bullying-in-SJTU-v3.1! 2021/6/8 紧急更新v3.1 更新说明 为了更好地保护用户隐私,将原来用户名+密码的登录方式改为微信扫二维码+cookie登录方式,不再需要配置使用pytesseract。在使用扫码登录模式时,请稍等,二维码将马

87 Sep 13, 2022
PFFDTD is an open-source FDTD simulator for 3D room acoustics

PFFDTD is an open-source FDTD simulator for 3D room acoustics

Brian Hamilton 34 Nov 24, 2022
SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data

SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data Au

14 Nov 28, 2022
Official Code for "Constrained Mean Shift Using Distant Yet Related Neighbors for Representation Learning"

CMSF Official Code for "Constrained Mean Shift Using Distant Yet Related Neighbors for Representation Learning" Requirements Python = 3.7.6 PyTorch

4 Nov 25, 2022
Awesome Deep Graph Clustering is a collection of SOTA, novel deep graph clustering methods

ADGC: Awesome Deep Graph Clustering ADGC is a collection of state-of-the-art (SOTA), novel deep graph clustering methods (papers, codes and datasets).

yueliu1999 297 Dec 27, 2022
Conflict-aware Inference of Python Compatible Runtime Environments with Domain Knowledge Graph, ICSE 2022

PyCRE Conflict-aware Inference of Python Compatible Runtime Environments with Domain Knowledge Graph, ICSE 2022 Dependencies This project is developed

<a href=[email protected]"> 7 May 06, 2022
Code for "Multi-Compound Transformer for Accurate Biomedical Image Segmentation"

News The code of MCTrans has been released. if you are interested in contributing to the standardization of the medical image analysis community, plea

97 Jan 05, 2023
From Perceptron model to Deep Neural Network from scratch in Python.

Neural-Network-Basics Aim of this Repository: From Perceptron model to Deep Neural Network (from scratch) in Python. ** Currently working on a basic N

Aditya Kahol 1 Jan 14, 2022
[CVPR'21] Multi-Modal Fusion Transformer for End-to-End Autonomous Driving

TransFuser This repository contains the code for the CVPR 2021 paper Multi-Modal Fusion Transformer for End-to-End Autonomous Driving. If you find our

695 Jan 05, 2023