Implementation of "A MLP-like Architecture for Dense Prediction"

Related tags

Deep LearningCycleMLP
Overview

A MLP-like Architecture for Dense Prediction (arXiv)

License: MIT Python 3.8

    

Updates

  • (22/07/2021) Initial release.

Model Zoo

We provide CycleMLP models pretrained on ImageNet 2012.

Model Parameters FLOPs Top 1 Acc. Download
CycleMLP-B1 15M 2.1G 78.9% model
CycleMLP-B2 27M 3.9G 81.6% model
CycleMLP-B3 38M 6.9G 82.4% model
CycleMLP-B4 52M 10.1G 83.0% model
CycleMLP-B5 76M 12.3G 83.2% model

Usage

Install

  • PyTorch 1.7.0+ and torchvision 0.8.1+
  • timm:
pip install 'git+https://github.com/rwightman/[email protected]'

or

git clone https://github.com/rwightman/pytorch-image-models
cd pytorch-image-models
git checkout c2ba229d995c33aaaf20e00a5686b4dc857044be
pip install -e .
  • fvcore (optional, for FLOPs calculation)
  • mmcv, mmdetection, mmsegmentation (optional)

Data preparation

Download and extract ImageNet train and val images from http://image-net.org/. The directory structure is:

│path/to/imagenet/
├──train/
│  ├── n01440764
│  │   ├── n01440764_10026.JPEG
│  │   ├── n01440764_10027.JPEG
│  │   ├── ......
│  ├── ......
├──val/
│  ├── n01440764
│  │   ├── ILSVRC2012_val_00000293.JPEG
│  │   ├── ILSVRC2012_val_00002138.JPEG
│  │   ├── ......
│  ├── ......

Evaluation

To evaluate a pre-trained CycleMLP-B5 on ImageNet val with a single GPU run:

python main.py --eval --model CycleMLP_B5 --resume path/to/CycleMLP_B5.pth --data-path /path/to/imagenet

Training

To train CycleMLP-B5 on ImageNet on a single node with 8 gpus for 300 epochs run:

python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --model CycleMLP_B5 --batch-size 128 --data-path /path/to/imagenet --output_dir /path/to/save

Acknowledgement

This code is based on DeiT and pytorch-image-models. Thanks for their wonderful works

Citing

@article{chen2021cyclemlp,
  title={CycleMLP: A MLP-like Architecture for Dense Prediction},
  author={Chen, Shoufa and Xie, Enze and Ge, Chongjian and Liang, Ding and Luo, Ping},
  journal={arXiv preprint arXiv:2107.10224},
  year={2021}
}

License

CycleMLP is released under MIT License.

Comments
  • detection result

    detection result

    Applying PVT detection framework, I tried a CycleMLP-B1 based detector with RetinaNet 1x. I got AP=27.1, fairly inferior to the reported 38.6. Could you give some advices to reproduce the reported result?

    The specific configure is as follows

    base = [ 'base/models/retinanet_r50_fpn.py', 'base/datasets/coco_detection.py', 'base/schedules/schedule_1x.py', 'base/default_runtime.py' ] #optimizer model = dict( pretrained='./pretrained/CycleMLP_B1.pth', backbone=dict( type='CycleMLP_B1_feat', style='pytorch'), neck=dict( type='FPN', in_channels=[64, 128, 320, 512], out_channels=256, start_level=1, add_extra_convs='on_input', num_outs=5)) #optimizer optimizer = dict(delete=True, type='AdamW', lr=0.0001, weight_decay=0.0001) optimizer_config = dict(grad_clip=None)

    find_unused_parameters = True

    opened by mountain111 6
  • Compiling CycleMLP

    Compiling CycleMLP

    Thank you for this great repo and interesting paper.

    I tried compiling CycleMLP to onnx and not surpassingly the process failed since CycleMLP include dynamic offset creation in https://github.com/ShoufaChen/CycleMLP/blob/main/cycle_mlp.py#L132 and as such cannot be converted to a frozen graph. Were you able to convert CycleMLP to onnx or any other frozen graph framework?

    Thanks in advance.

    opened by shairoz-deci 6
  • Questions about offset calculation

    Questions about offset calculation

    Hi, thanks for your wonderful work.

    I'm currently studying your work, and come up with some question about the offset calculations.

    I understood the offset calculation mentioned on the paper, but can't understand about how generated offset is being used in the code.

    For ex) if $S_H \times S_W : 3 \times 1$; I understood how the offset is applied in this figure 스크린샷 2022-06-13 오후 9 18 20

    by calculate like this: 스크린샷 2022-06-13 오후 9 19 57

    However, when I run the offset generating code, I can't figure out how this offset is being used in deform_conv2d 스크린샷 2022-06-13 오후 9 21 57

    Can you provide more detailed information about this??

    And also, the paper contains how $S_H \times S_W: 3 \times 3$ works, but in the code, it seems like either one ofkernel_size[0] or kernel_size[1] has to be 1. So, if I want to use $S_H \times S_W : 3 \times 3$, do I have to make $3 \times 1$ and $1 \times 3$ offsets and add those together?

    Thank you again for your work. I really learned a lot.

    opened by tae-mo 5
  • Example of CycleMLP Configuration for Dense Prediction

    Example of CycleMLP Configuration for Dense Prediction

    Hello.

    First of all, thank you for curating this interesting work. I was wondering, are there any working examples of how I can use CycleMLP for dense prediction while maintaining the original input size (e.g., predict a 0 or 1 value for each pixel in an input image)? In addition, I am interested in only a single ("annotated") output image, although I noticed the model definitions given in this repository output multiple downsampled versions of the original input image. Any thoughts on this?

    Thank you in advance for your time.

    opened by amorehead 2
  • Swin-B vs CycleMLP-B on image classification

    Swin-B vs CycleMLP-B on image classification

    For classificaion on ImageNet-1k, the acuracy of Swin-B is 83.5, which is 0.1 higher than the proposed CycleMLP-B. But, in this paper, the authors reprot that the accuracy of Swin-B is 83.3, which is 0.1 lower than the proposed CycleMLP-B. Why are these accuracies different?

    opened by hkzhang91 1
  • question about the offset

    question about the offset

    Thanks for your work!

    The implementation of this code inspired me. But the calculation of offset here is confusing. Although this issue (https://github.com/ShoufaChen/CycleMLP/issues/10) has asked similar questions, I haven't found a reasonable explanation.

    https://github.com/ShoufaChen/CycleMLP/blob/2f76a1f6e3cc6672143fdac46e3db5f9a7341253/cycle_mlp.py#L127-L136

    kernel_size = (1, 3)
    start_idx = (kernel_size[0] * kernel_size[1]) // 2
    for i in range(num_channels):
        offset[0, 2 * i + 0, 0, 0] = 0
        # relative offset
        offset[0, 2 * i + 1, 0, 0] = (i + start_idx) % kernel_size[1] - (kernel_size[1] // 2)
    offset.reshape(num_channels, 2)
    
    tensor([[ 0.,  0.],
            [ 0.,  1.],
            [ 0., -1.],
            [ 0.,  0.],
            [ 0.,  1.],
            [ 0., -1.]])
    

    the results are different with the figure in paper:

    image

    Some codes for verification:

    import torch
    from torchvision.ops import deform_conv2d
    
    num_channels = 6
    
    data = torch.arange(1, 6).reshape(1, 1, 1, 5).expand(-1, num_channels, -1, -1)
    data
    """
    tensor([[[[1, 2, 3, 4, 5]],
             [[1, 2, 3, 4, 5]],
             [[1, 2, 3, 4, 5]],
             [[1, 2, 3, 4, 5]],
             [[1, 2, 3, 4, 5]],
             [[1, 2, 3, 4, 5]]]])
    """
    
    weight = torch.eye(num_channels).reshape(num_channels, num_channels, 1, 1)
    weight.reshape(num_channels, num_channels)
    """
    tensor([[1., 0., 0., 0., 0., 0.],
            [0., 1., 0., 0., 0., 0.],
            [0., 0., 1., 0., 0., 0.],
            [0., 0., 0., 1., 0., 0.],
            [0., 0., 0., 0., 1., 0.],
            [0., 0., 0., 0., 0., 1.]])
    """
    
    offset = torch.empty(1, 2 * num_channels * 1 * 1, 1, 1)
    kernel_size = (1, 3)
    start_idx = (kernel_size[0] * kernel_size[1]) // 2
    for i in range(num_channels):
        offset[0, 2 * i + 0, 0, 0] = 0
        # relative offset
        offset[0, 2 * i + 1, 0, 0] = (
            (i + start_idx) % kernel_size[1] - (kernel_size[1] // 2)
        )
    offset.reshape(num_channels, 2)
    """
    tensor([[ 0.,  0.],
            [ 0.,  1.],
            [ 0., -1.],
            [ 0.,  0.],
            [ 0.,  1.],
            [ 0., -1.]])
    """
    
    deform_conv2d(
        data.float(), 
        offset=offset.expand(-1, -1, -1, 5).float(), 
        weight=weight.float(), 
        bias=None,
    )
    """
    tensor([[[[1., 2., 3., 4., 5.]],
             [[2., 3., 4., 5., 0.]],
             [[0., 1., 2., 3., 4.]],
             [[1., 2., 3., 4., 5.]],
             [[2., 3., 4., 5., 0.]],
             [[0., 1., 2., 3., 4.]]]])
    """
    
    opened by lartpang 1
  • question about the offset

    question about the offset

    Hi, thank you very much for your excellent work. In Fig.4 of your paper, you show the pseudo-kernel when kernel size is 1x3. But I when I find that function "gen_offset" does not generate the same offset as Fig.4. The offset it generates is "0,1,0,-1,0,0,0,1..." instead of "0,1,0,-1,0,1,0,-1', which is shown in Fig.4. So could you please tell me the reason? image image

    opened by linjing7 1
  • About

    About "crop_pct"

    Hi, thanks for your great work and code. I wonder the parameter crop_pct actually works in which part of code. When I go throught the timm, I can't find out how this crop_pct is loaded.

    opened by ggjy 1
  • How to deploy CycleMLP-T for training?

    How to deploy CycleMLP-T for training?

    Thank you very much for such a wonderful work!

    After learning the cycle_mlp source code in the repository, I am very confused to deploy CycleMLP Block based on Swin Transformer. Is it convenient for you to release swin-based CycleMLP? Looking forward to your reply, Thanks!

    opened by Pak287 0
Owner
Shoufa Chen
Shoufa Chen
Fedlearn支持前沿算法研发的Python工具库 | Fedlearn algorithm toolkit for researchers

FedLearn-algo Installation Development Environment Checklist python3 (3.6 or 3.7) is required. To configure and check the development environment is c

89 Nov 14, 2022
Neural HMMs are all you need (for high-quality attention-free TTS)

Neural HMMs are all you need (for high-quality attention-free TTS) Shivam Mehta, Éva Székely, Jonas Beskow, and Gustav Eje Henter This is the official

Shivam Mehta 0 Oct 28, 2022
TimeSHAP explains Recurrent Neural Network predictions.

TimeSHAP TimeSHAP is a model-agnostic, recurrent explainer that builds upon KernelSHAP and extends it to the sequential domain. TimeSHAP computes even

Feedzai 90 Dec 18, 2022
InsightFace: 2D and 3D Face Analysis Project on MXNet and PyTorch

InsightFace: 2D and 3D Face Analysis Project on MXNet and PyTorch

Deep Insight 13.2k Jan 06, 2023
A Python package to create, run, and post-process MODFLOW-based models.

Version 3.3.5 — release candidate Introduction FloPy includes support for MODFLOW 6, MODFLOW-2005, MODFLOW-NWT, MODFLOW-USG, and MODFLOW-2000. Other s

388 Nov 29, 2022
Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implicit Bayesian Inference"

GINC small-scale in-context learning dataset GINC (Generative In-Context learning Dataset) is a small-scale synthetic dataset for studying in-context

P-Lambda 29 Dec 19, 2022
ReSSL: Relational Self-Supervised Learning with Weak Augmentation

ReSSL: Relational Self-Supervised Learning with Weak Augmentation This repository contains PyTorch evaluation code, training code and pretrained model

mingkai 45 Oct 25, 2022
DeepLM: Large-scale Nonlinear Least Squares on Deep Learning Frameworks using Stochastic Domain Decomposition (CVPR 2021)

DeepLM DeepLM: Large-scale Nonlinear Least Squares on Deep Learning Frameworks using Stochastic Domain Decomposition (CVPR 2021) Run Please install th

Jingwei Huang 130 Dec 02, 2022
[ArXiv 2021] Data-Efficient Instance Generation from Instance Discrimination

InsGen - Data-Efficient Instance Generation from Instance Discrimination Data-Efficient Instance Generation from Instance Discrimination Ceyuan Yang,

GenForce: May Generative Force Be with You 93 Dec 25, 2022
Do Neural Networks for Segmentation Understand Insideness?

This is part of the code to reproduce the results of the paper Do Neural Networks for Segmentation Understand Insideness? [pdf] by K. Villalobos (*),

biolins 0 Mar 20, 2021
PyTorch implementation of DARDet: A Dense Anchor-free Rotated Object Detector in Aerial Images

DARDet PyTorch implementation of "DARDet: A Dense Anchor-free Rotated Object Detector in Aerial Images", [pdf]. Highlights: 1. We develop a new dense

41 Oct 23, 2022
PyTorch implementation of SCAFFOLD (Stochastic Controlled Averaging for Federated Learning, ICML 2020).

Scaffold-Federated-Learning PyTorch implementation of SCAFFOLD (Stochastic Controlled Averaging for Federated Learning, ICML 2020). Environment numpy=

KI 30 Dec 29, 2022
Code for ICE-BeeM paper - NeurIPS 2020

ICE-BeeM: Identifiable Conditional Energy-Based Deep Models Based on Nonlinear ICA This repository contains code to run and reproduce the experiments

Ilyes Khemakhem 65 Dec 22, 2022
Google AI Open Images - Object Detection Track: Open Solution

Google AI Open Images - Object Detection Track: Open Solution This is an open solution to the Google AI Open Images - Object Detection Track 😃 More c

minerva.ml 46 Jun 22, 2022
Official code for 'Pixel-wise Energy-biased Abstention Learning for Anomaly Segmentationon Complex Urban Driving Scenes'

PEBAL This repo contains the Pytorch implementation of our paper: Pixel-wise Energy-biased Abstention Learning for Anomaly Segmentation on Complex Urb

Yu Tian 117 Jan 03, 2023
A package, and script, to perform imaging transcriptomics on a neuroimaging scan.

Imaging Transcriptomics Imaging transcriptomics is a methodology that allows to identify patterns of correlation between gene expression and some prop

Alessio Giacomel 10 Dec 27, 2022
Label Studio is a multi-type data labeling and annotation tool with standardized output format

Website • Docs • Twitter • Join Slack Community What is Label Studio? Label Studio is an open source data labeling tool. It lets you label data types

Heartex 11.7k Jan 09, 2023
Image-to-Image Translation with Conditional Adversarial Networks (Pix2pix) implementation in keras

pix2pix-keras Pix2pix implementation in keras. Original paper: Image-to-Image Translation with Conditional Adversarial Networks (pix2pix) Paper Author

William Falcon 141 Dec 30, 2022
Perspective: Julia for Biologists

Perspective: Julia for Biologists 1. Examples Speed: Example 1 - Single cell data and network inference Domain: Single cell data Methodology: Network

Elisabeth Roesch 55 Dec 02, 2022
Implementation for our ICCV 2021 paper: Dual-Camera Super-Resolution with Aligned Attention Modules

DCSR: Dual Camera Super-Resolution Implementation for our ICCV 2021 oral paper: Dual-Camera Super-Resolution with Aligned Attention Modules paper | pr

Tengfei Wang 110 Dec 20, 2022