The official implementation of "Rethink Dilated Convolution for Real-time Semantic Segmentation"

RegSeg

Paper: arXiv:2111.09957

[Figure: params]

D block

[Figure: DBlock]

Decoder

[Figure: Decoder]

Setup

Install the dependencies in requirements.txt by using pip and virtualenv.

Download Cityscapes

Go to https://www.cityscapes-dataset.com, create an account, and download gtFine_trainvaltest.zip and leftImg8bit_trainvaltest.zip. You can delete the test images to save space if you don't plan to submit to the competition. Name the directory cityscapes_dataset. Make sure that you have installed the required Python packages, then run:

CITYSCAPES_DATASET=cityscapes_dataset csCreateTrainIdLabelImgs

There are 19 classes.
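
As a rough sanity check after running csCreateTrainIdLabelImgs, the sketch below counts the input images and the generated trainId label maps for each split. It assumes the standard Cityscapes layout (leftImg8bit/<split>/<city>/*_leftImg8bit.png and gtFine/<split>/<city>/*_gtFine_labelTrainIds.png). The function name count_cityscapes_files is hypothetical and is not part of this repository.

```python
from pathlib import Path


def count_cityscapes_files(root):
    """Count input images and trainId label maps per split under `root`.

    Assumes the standard Cityscapes directory layout (an assumption, not
    something this repository enforces).
    """
    root = Path(root)
    counts = {}
    for split in ("train", "val"):
        images = list((root / "leftImg8bit" / split).glob("*/*_leftImg8bit.png"))
        labels = list((root / "gtFine" / split).glob("*/*_gtFine_labelTrainIds.png"))
        counts[split] = (len(images), len(labels))
    return counts


if __name__ == "__main__":
    # "cityscapes_dataset" is the directory name suggested in the setup text.
    for split, (n_images, n_labels) in count_cityscapes_files("cityscapes_dataset").items():
        print(f"{split}: {n_images} images, {n_labels} trainId label maps")
```

If the label maps were generated correctly, the two counts for each split should match. The log quoted in the comments below reports 2975 training and 500 validation images.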

Results from paper

To see the ablation study results from the paper, go here.

Usage

To visualize your model, go to show.py. To train, validate, benchmark, and save the results of your model, go to train.py.

Results on Cityscapes server

RegSeg (exp48_decoder26, 30 FPS): 78.3

Larger RegSeg (exp53_decoder29, 20 FPS): 79.5

Citation

If you find our work helpful, please consider citing our paper.

@article{gao2021rethink,
  title={Rethink Dilated Convolution for Real-time Semantic Segmentation},
  author={Gao, Roland},
  journal={arXiv preprint arXiv:2111.09957},
  year={2021}
}
Comments
  • question about STDC2-Seg75

    Hi, I noticed that you benchmark the computation of STDC2-Seg75, which is not reported in the CVPR 2021 paper. Did you test the speed of STDC-Seg on your own platform? What were the results?

    opened by ydhongHIT 2
  • Can not show.py

    I tried to run show.py, but it fails with the error below.

    $ python3 show.py
    name= cityscapes
    train size: 2975
    val size: 500
    Traceback (most recent call last):
      File "show.py", line 358, in <module>
        show_cityscapes_model()
      File "show.py", line 337, in show_cityscapes_model
        show(model,val_loader,device,show_cityscapes_mask,num_images=num_images,skip=skip,images_per_line=images_per_line)
      File "show.py", line 134, in show
        outputs = model(images)
      File "/home/sounansu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/sounansu/RegSeg/model.py", line 76, in forward
        x=self.stem(x)
      File "/home/sounansu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/sounansu/RegSeg/blocks.py", line 22, in forward
        x = self.conv(x)
      File "/home/sounansu/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/sounansu/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 446, in forward
        return self._conv_forward(input, self.weight, self.bias)
      File "/home/sounansu/.local/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 442, in _conv_forward
        return F.conv2d(input, weight, bias, self.stride,
    RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
    
    opened by sounansu 2
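
This error usually means the input tensors were moved to the GPU while the model weights stayed on the CPU. Below is a minimal sketch of the usual fix. It assumes nothing about show.py itself: the Conv2d layer is a stand-in for the real model, and the tensor sizes are arbitrary.

```python
import torch
import torch.nn as nn

# Pick the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for the real segmentation model (an assumption for illustration).
model = nn.Conv2d(3, 8, kernel_size=3, padding=1)
model = model.to(device)  # move the weights to the same device as the inputs
model.eval()

# Inputs must live on the same device as the weights.
images = torch.randn(1, 3, 64, 64).to(device)
with torch.no_grad():
    outputs = model(images)

print(outputs.shape, outputs.device)
```

If the model is built or loaded from a checkpoint before the call that fails, calling .to(device) on it right after construction is one place to check first.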
  • The pretrained model link

    Hi, thank you for sharing the code. Could you provide a download link for the pretrained models (exp48_decoder26 and exp53_decoder29) trained on the Cityscapes dataset? Thank you very much!

    opened by gaowq2017 1
  • About train bug

    When I use seg_transforms.py through your script 'camvid_efficientnet_b1_hyperseg-s', I always get "TypeError: resize() got an unexpected keyword argument 'interpolation'" at line 174. Does this bug appear only with this script, and should I modify the code when using it?

    opened by 870572761 0
  • CVE-2007-4559 Patch

    Patching CVE-2007-4559

    Hi, we are security researchers from the Advanced Research Center at Trellix. We have begun a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a 15-year-old bug in the Python tarfile package. By using extract() or extractall() on a tarfile object without sanitizing input, a maliciously crafted .tar file could perform a directory path traversal attack. We found at least one unsanitized extractall() in your codebase and are providing a patch for you via pull request. The patch essentially checks whether all tarfile members will be extracted safely and throws an exception otherwise. We encourage you to use this patch or your own solution to secure against CVE-2007-4559. Further technical information about the vulnerability can be found in this blog.

    If you have further questions, you may contact us through this project's lead researcher, Kasimir Schulz.

    opened by TrellixVulnTeam 0
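
The check described in this issue can be sketched as follows. This is an illustration of the general idea (verify every member path before extracting anything), not the actual patch from the pull request. The helper names is_within_directory and safe_extractall are hypothetical.

```python
import os
import tarfile


def is_within_directory(directory, target):
    """Return True if `target` resolves to a location inside `directory`."""
    abs_directory = os.path.abspath(directory)
    abs_target = os.path.abspath(target)
    return os.path.commonpath([abs_directory, abs_target]) == abs_directory


def safe_extractall(tar, path="."):
    """Extract every member of `tar`, but only if none escapes `path`."""
    for member in tar.getmembers():
        member_path = os.path.join(path, member.name)
        if not is_within_directory(path, member_path):
            # Refuse the whole archive rather than extracting part of it.
            raise ValueError(f"Blocked path traversal in tar member: {member.name}")
    tar.extractall(path)
```

This sketch only checks member names and does not inspect symbolic or hard links inside the archive. On Python 3.12 and later, passing filter="data" to extractall() is another option worth considering.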
  • About train code

    During training, how are the mIoU and accuracy calculated? On the training set or the validation set? I think they are calculated on the validation set, based on https://github.com/RolandGao/RegSeg/blob/main/train.py#L238. I trained the base RegSeg model with config cityscapes_trainval_1000epochs.yam on Cityscapes and got unbelievable results. [image]

    opened by Asthestarsfalll 6
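
For readers unfamiliar with the two metrics in question, here is a generic sketch of how mIoU and pixel accuracy are commonly derived from a confusion matrix. It is not taken from train.py. The function names are hypothetical, the ignore label 255 is the usual Cityscapes trainId convention, and skipping classes that never occur is one possible design choice among several.

```python
def confusion_matrix(preds, targets, num_classes, ignore_index=255):
    """Build a num_classes x num_classes matrix indexed as [target][prediction]."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(preds, targets):
        if t == ignore_index:
            continue  # ignored pixels do not count toward any metric
        matrix[t][p] += 1
    return matrix


def miou_and_accuracy(matrix):
    """Return (mean IoU, pixel accuracy) for a confusion matrix."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    ious = []
    for i in range(n):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp
        fp = sum(matrix[j][i] for j in range(n)) - tp
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both predictions and targets
            ious.append(tp / denom)
    miou = sum(ious) / len(ious) if ious else 0.0
    accuracy = correct / total if total else 0.0
    return miou, accuracy


# Flattened per-pixel labels for a toy two-class example.
preds = [0, 0, 1, 1, 0]
targets = [0, 1, 1, 1, 255]
miou, accuracy = miou_and_accuracy(confusion_matrix(preds, targets, num_classes=2))
print(miou, accuracy)
```

Whichever split the loader passed to the evaluation step draws from determines whether the reported numbers are training or validation metrics.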
  • confusion on field of view and model inference time

    Hi RolandGao, nice work! I see you've done a lot of experiments on the backbone setting, but I still have some questions after reading your published paper.

    • First, you calculate that a field of view (FOV) of 4095 is needed to see the bottom-right pixel when training on Cityscapes (1024x2048), so you verified that the backbone should be exp48 [ (1,1) + (1,2) + 4 * (1,4) + 7 * (1,14) ] with an FOV of 3807. But I find the same backbone used when training on CamVid (720x960). Why not use a shallower backbone? I am training on my own dataset with an image resolution of 512x512; do I need to modify the backbone architecture? Can you give some advice?
    • Second, I tested the inference time of RegSeg. I noticed that the speed is no better than other real-time architectures because of the split and dilated convolutions, even though the model has low GFLOPs. In applications, what we care about is speed, so is there any strategy to improve it?
    opened by LinaShanghaitech 5
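
The 4095 figure quoted in this issue is consistent with a simple rule: for a window centred on one corner pixel to reach the opposite corner along the longer side W, it needs a width of 2 * W - 1. The sketch below applies that rule to the three resolutions mentioned. The rule is an assumption inferred from the quoted number, and required_field_of_view is a hypothetical helper.

```python
def required_field_of_view(height, width):
    """FOV needed for a corner pixel to see the opposite corner.

    Assumes a square window centred on the pixel, so the longer image side
    determines the requirement: 2 * max(height, width) - 1.
    """
    return 2 * max(height, width) - 1


resolutions = {
    "Cityscapes": (1024, 2048),
    "CamVid": (720, 960),
    "custom": (512, 512),
}
for name, (h, w) in resolutions.items():
    print(name, required_field_of_view(h, w))
# Cityscapes 4095
# CamVid 1919
# custom 1023
```

Under this rule, the smaller datasets need a much smaller field of view than the 3807 quoted for exp48, which is the basis of the question about a shallower backbone.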
  • Why not pretrain on ImageNet?

    Hi, thanks for your excellent work! I noticed that RegSeg achieves high accuracy on Cityscapes without pretraining. I also ran a lot of ablation studies, and I think DDRNet would drop around 3% mIoU without ImageNet pretraining. How about training your encoder on ImageNet to see what happens? I really look forward to your results. Thanks!

    opened by RobinhoodKi 1
Owner
Roland
University of Toronto CS 2023