2021-MICCAI-Progressively Normalized Self-Attention Network for Video Polyp Segmentation

Overview

2021-MICCAI-Progressively Normalized Self-Attention Network for Video Polyp Segmentation

Authors: Ge-Peng Ji*, Yu-Cheng Chou*, Deng-Ping Fan, Geng Chen, Huazhu Fu, Debesh Jha, & Ling Shao.

This repository provides code for paper"Progressively Normalized Self-Attention Network for Video Polyp Segmentation" published at the MICCAI-2021 conference (arXiv Version | 中文版). If you have any questions about our paper, feel free to contact me. And if you like our PNS-Net or evaluation toolbox for your personal research, please cite this paper (BibTeX).

Features

  • Hyper Real-time Speed: Our method, named Progressively Normalized Self-Attention Network (PNS-Net), can efficiently learn representations from polyp videos with real-time speed (~140fps) on a single NVIDIA RTX 2080 GPU without any post-processing techniques (e.g., Dense-CRF).
  • Plug-and-Play Module: The proposed core module, termed Normalized Self-attention (NS), utilizes channel split,query-dependent, and normalization rules to reduce the computational cost and improve the accuracy, respectively. Note that this module can be flexibly plugged into any framework customed.
  • Cutting-edge Performance: Experiments on three challenging video polyp segmentation (VPS) datasets demonstrate that the proposed PNS-Net achieves state-of-the-art performance.
  • One-key Evaluation Toolbox: We release the first one-key evaluation toolbox in the VPS field.

1.1. 🔥 NEWS 🔥 :

  • [2021/06/25] 🔥 Our paper have been elected to be honred a MICCAI Student Travel Award.
  • [2021/06/19] 🔥 A short introduction of our paper is available on my YouTube channel (2min).
  • [2021/06/18] Release the inference code! The whole project will be available at the time of MICCAI-2021.
  • [2021/06/18] The Chinese translation of our paper is coming, please enjoy it [pdf].
  • [2021/05/27] Uploading the training/testing dataset, snapshot, and benchmarking results.
  • [2021/05/14] Our work is provisionally accepted at MICCAI 2021. Many thanks to my collaborator Yu-Cheng Chou and supervisor Prof. Deng-Ping Fan.
  • [2021/03/10] Create repository.

1.2. Table of Contents

Table of contents generated with markdown-toc

1.3. State-of-the-art Approaches

  1. "PraNet: Parallel Reverse Attention Network for Polyp Segmentation" MICCAI, 2020. doi: https://arxiv.org/pdf/2006.11392.pdf
  2. "Adaptive context selection for polyp segmentation" MICCAI, 2020. doi: https://link.springer.com/chapter/10.1007/978-3-030-59725-2_25
  3. "Resunet++: An advanced architecture for medical image segmentation" IEEE ISM, 2019 doi: https://arxiv.org/pdf/1911.07067.pdf
  4. "Unet++: A nested u-net architecture for medical image segmentation" IEEE TMI, 2019 doi: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7329239/
  5. "U-Net: Convolutional networks for biomed- ical image segmentation" MICCAI, 2015. doi: https://arxiv.org/pdf/1505.04597.pdf

2. Overview

2.1. Introduction

Existing video polyp segmentation (VPS) models typically employ convolutional neural networks (CNNs) to extract features. However, due to their limited receptive fields, CNNs can not fully exploit the global temporal and spatial information in successive video frames, resulting in false-positive segmentation results. In this paper, we propose the novel PNS-Net (Progressively Normalized Self-attention Network), which can efficiently learn representations from polyp videos with real-time speed (~140fps) on a single RTX 2080 GPU and no post-processing.

Our PNS-Net is based solely on a basic normalized self-attention block, dispensing with recurrence and CNNs entirely. Experiments on challenging VPS datasets demonstrate that the proposed PNS-Net achieves state-of-the-art performance. We also conduct extensive experiments to study the effectiveness of the channel split, soft-attention, and progressive learning strategy. We find that our PNS-Net works well under different settings, making it a promising solution to the VPS task.

2.2. Framework Overview


Figure 1: Overview of the proposed PNS-Net, including the normalized self-attention block (see § 2.1) with a stacked (×R) learning strategy. See § 2 in the paper for details.

2.3. Qualitative Results


Figure 2: Qualitative Results.

3. Proposed Baseline

3.1. Training/Testing

The training and testing experiments are conducted using PyTorch with a single GeForce RTX 2080 GPU of 8 GB Memory.

  1. Configuring your environment (Prerequisites):

    Note that PNS-Net is only tested on Ubuntu OS with the following environments. It may work on other operating systems as well but we do not guarantee that it will.

    • Creating a virtual environment in terminal:

    conda create -n PNSNet python=3.6.

    • Installing necessary packages PyTorch 1.1:
    conda create -n PNSNet python=3.6
    conda activate PNSNet
    conda install pytorch=1.1.0 torchvision -c pytorch
    pip install tensorboardX tqdm Pillow==6.2.2
    pip install git+https://github.com/pytorch/[email protected]
    • Our core design is built on CUDA OP with torchlib. Please ensure the base CUDA toolkit version is 10.x (not at conda env), and then build the NS Block:
    cd ./lib/PNS
    python setup.py build develop
  2. Downloading necessary data:

  3. Training Configuration:

    • First, run python MyTrain_Pretrain.py in the terminal for pretraining, and then, run python MyTrain_finetune.py for finetuning.

    • Just enjoy it! Finish it and the snapshot would save in ./snapshot/PNS-Net/*.

  4. Testing Configuration:

    • After you download all the pre-trained model and testing dataset, just run MyTest_finetune.py to generate the final prediction map in ./res.

    • Just enjoy it!

    • The prediction results of all competitors and our PNS-Net can be found at Google Drive (7MB).

3.2 Evaluating your trained model:

One-key evaluation is written in MATLAB code (link), please follow this the instructions in ./eval/main_VPS.m and just run it to generate the evaluation results in ./eval-Result/.

4. Citation

Please cite our paper if you find the work useful:

@inproceedings{ji2021pnsnet,
  title={Progressively Normalized Self-Attention Network for Video Polyp Segmentation},
  author={Ji, Ge-Peng and Chou, Yu-Cheng and Fan, Deng-Ping and Chen, Geng and Jha, Debesh and Fu, Huazhu and Shao, Ling},
  booktitle={MICCAI},
  year={2021}
}

5. TODO LIST

If you want to improve the usability or any piece of advice, please feel free to contact me directly (E-mail).

  • Support NVIDIA APEX training.

  • Support different backbones ( VGGNet, ResNet, ResNeXt, iResNet, and ResNeSt etc.)

  • Support distributed training.

  • Support lightweight architecture and real-time inference, like MobileNet, SqueezeNet.

  • Support distributed training

  • Add more comprehensive competitors.

6. FAQ

  1. If the image cannot be loaded on the page (mostly in the domestic network situations).

    Solution Link


7. Acknowledgements

This code is built on SINetV2 (PyTorch) and PyramidCSA (PyTorch). We thank the authors for sharing the codes.

back to top

Owner
Ge-Peng Ji (Daniel)
Computer Vision & Medical Imaging
Ge-Peng Ji (Daniel)
LightningFSL: Pytorch-Lightning implementations of Few-Shot Learning models.

LightningFSL: Few-Shot Learning with Pytorch-Lightning In this repo, a number of pytorch-lightning implementations of FSL algorithms are provided, inc

Xu Luo 76 Dec 11, 2022
Python inverse kinematics for your robot model based on Pinocchio.

Python inverse kinematics for your robot model based on Pinocchio.

Stéphane Caron 50 Dec 22, 2022
The official code of Anisotropic Stroke Control for Multiple Artists Style Transfer

ASMA-GAN Anisotropic Stroke Control for Multiple Artists Style Transfer Proceedings of the 28th ACM International Conference on Multimedia The officia

Six_God 146 Nov 21, 2022
🔥3D-RecGAN in Tensorflow (ICCV Workshops 2017)

3D Object Reconstruction from a Single Depth View with Adversarial Learning Bo Yang, Hongkai Wen, Sen Wang, Ronald Clark, Andrew Markham, Niki Trigoni

Bo Yang 125 Nov 26, 2022
A Blender python script for getting asset browser custom preview images for objects and collections.

asset_snapshot A Blender python script for getting asset browser custom preview images for objects and collections. Installation: Click the code butto

Johnny Matthews 44 Nov 29, 2022
PyAF is an Open Source Python library for Automatic Time Series Forecasting built on top of popular pydata modules.

PyAF (Python Automatic Forecasting) PyAF is an Open Source Python library for Automatic Forecasting built on top of popular data science python module

CARME Antoine 405 Jan 02, 2023
ThunderGBM: Fast GBDTs and Random Forests on GPUs

Documentations | Installation | Parameters | Python (scikit-learn) interface What's new? ThunderGBM won 2019 Best Paper Award from IEEE Transactions o

Xtra Computing Group 647 Jan 04, 2023
Use AI to generate a optimized stock portfolio

Use AI, Modern Portfolio Theory, and Monte Carlo simulation's to generate a optimized stock portfolio that minimizes risk while maximizing returns. Ho

Greg James 30 Dec 22, 2022
2021 credit card consuming recommendation

2021 credit card consuming recommendation

Wang, Chung-Che 7 Mar 08, 2022
An implementation of the Contrast Predictive Coding (CPC) method to train audio features in an unsupervised fashion.

CPC_audio This code implements the Contrast Predictive Coding algorithm on audio data, as described in the paper Unsupervised Pretraining Transfers we

8 Nov 14, 2022
A pytorch-based real-time segmentation model for autonomous driving

CFPNet: Channel-Wise Feature Pyramid for Real-Time Semantic Segmentation This project contains the Pytorch implementation for the proposed CFPNet: pap

342 Dec 22, 2022
Facial recognition project

Facial recognition project documentation Project introduction This project is developed by linuxu. It is a face model recognition project developed ba

Jefferson 2 Dec 04, 2022
STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech

STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech Keon Lee, Ky

Keon Lee 114 Dec 12, 2022
A repository for storing njxzc final exam review material

文档地址,请戳我 👈 👈 👈 ☀️ 1.Reason 大三上期末复习软件工程的时候,发现其他高校在GitHub上开源了他们学校的期末试题,我很受触动。期末

GuJiakai 2 Jan 18, 2022
AOT-GAN for High-Resolution Image Inpainting (codebase for image inpainting)

AOT-GAN for High-Resolution Image Inpainting Arxiv Paper | AOT-GAN: Aggregated Contextual Transformations for High-Resolution Image Inpainting Yanhong

Multimedia Research 214 Jan 03, 2023
Repository containing the PhD Thesis "Formal Verification of Deep Reinforcement Learning Agents"

Getting Started This repository contains the code used for the following publications: Probabilistic Guarantees for Safe Deep Reinforcement Learning (

Edoardo Bacci 5 Aug 31, 2022
Key information extraction from invoice document with Graph Convolution Network

Key Information Extraction from Scanned Invoices Key information extraction from invoice document with Graph Convolution Network Related blog post fro

Phan Hoang 39 Dec 16, 2022
The source code and data of the paper "Instance-wise Graph-based Framework for Multivariate Time Series Forecasting".

IGMTF The source code and data of the paper "Instance-wise Graph-based Framework for Multivariate Time Series Forecasting". Requirements The framework

Wentao Xu 24 Dec 05, 2022
Code and dataset for AAAI 2021 paper FixMyPose: Pose Correctional Describing and Retrieval Hyounghun Kim, Abhay Zala, Graham Burri, Mohit Bansal.

FixMyPose / फिक्समाइपोज़ Code and dataset for AAAI 2021 paper "FixMyPose: Pose Correctional Describing and Retrieval" Hyounghun Kim*, Abhay Zala*, Grah

4 Sep 19, 2022
Geometric Vector Perceptron --- a rotation-equivariant GNN for learning from biomolecular structure

Geometric Vector Perceptron Code to accompany Learning from Protein Structure with Geometric Vector Perceptrons by B Jing, S Eismann, P Suriana, RJL T

Dror Lab 85 Dec 29, 2022