Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation

Overview

Auto-Seg-Loss

By Hao Li, Chenxin Tao, Xizhou Zhu, Xiaogang Wang, Gao Huang, Jifeng Dai

This is the official implementation of the ICLR 2021 paper Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation.

Introduction

TL; DR.

Auto Seg-Loss is the first general framework for searching surrogate losses for mainstream semantic segmentation metrics.

Abstract.

Designing proper loss functions is essential in training deep networks. Especially in the field of semantic segmentation, various evaluation metrics have been proposed for diverse scenarios. Despite the success of the widely adopted cross-entropy loss and its variants, the mis-alignment between the loss functions and evaluation metrics degrades the network performance. Meanwhile, manually designing loss functions for each specific metric requires expertise and significant manpower. In this paper, we propose to automate the design of metric-specific loss functions by searching differentiable surrogate losses for each metric. We substitute the non-differentiable operations in the metrics with parameterized functions, and conduct parameter search to optimize the shape of loss surfaces. Two constraints are introduced to regularize the search space and make the search efficient. Extensive experiments on PASCAL VOC and Cityscapes demonstrate that the searched surrogate losses outperform the manually designed loss functions consistently. The searched losses can generalize well to other datasets and networks.

ASL-overview ASL-results

License

This project is released under the Apache 2.0 license.

Citing Auto Seg-Loss

If you find Auto Seg-Loss useful in your research, please consider citing:

@inproceedings{li2020auto,
  title={Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation},
  author={Li, Hao and Tao, Chenxin and Zhu, Xizhou and Wang, Xiaogang and Huang, Gao and Dai, Jifeng},
  booktitle={ICLR},
  year={2021}
}

Configs

PASCAL VOC Search experiments

Target Metric mIoU FWIoU mAcc gAcc BIoU BF1
Parameterization bezier bezier bezier bezier bezier bezier
URL config config config config config config

PASCAL VOC Re-training experiments

Target Metric mIoU FWIoU mAcc gAcc BIoU BF1
Cross Entropy 78.69 91.31 87.31 95.17 70.61 65.30
ASL 80.97 91.93 92.95 95.22 79.27 74.83
URL config
log
config
log
config
log
config
log
config
log
config
log

Note:

1. The search experiments are conducted with R50-DeepLabV3+.

2. The re-training experiments are conducted with R101-DeeplabV3+.

Installation

Our implementation is based on MMSegmentation.

Prerequisites

  • Python>=3.7

    We recommend you to use Anaconda to create a conda environment:

    conda create -n auto_segloss python=3.8 -y

    Then, activate the environment:

    conda activate auto_segloss
  • PyTorch>=1.7.0, torchvision>=0.8.0 (following official instructions).

    For example, if your CUDA version is 10.1, you could install pytorch and torchvision as follows:

    conda install pytorch=1.8.0 torchvision=0.9.0 cudatoolkit=10.1 -c pytorch
  • MMCV>=1.3.0 (following official instruction).

    We recommend installing the pre-built mmcv-full. For example, if your CUDA version is 10.1 and pytorch version is 1.8.0, you could run:

    pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.8.0/index.html

Installing the modified mmsegmentation

git clone https://github.com/fundamentalvision/Auto-Seg-Loss.git
cd Auto-Seg-Loss
pip install -e .

Usage

Dataset preparation

Please follow the official guide of MMSegmentation to organize the datasets. It's highly recommended to symlink the dataset root to Auto-Seg-Loss/data. The recommended data structure is as follows:

Auto-Seg-Loss
├── mmseg
├── ASL_search
└── data
    └── VOCdevkit
        ├── VOC2012
        └── VOCaug

Training models with the provided parameters

The re-training command format is

./ASL_retrain.sh {CONFIG_NAME} [{NUM_GPUS}] [{SEED}]

For example, the command for training a ResNet-101 DeepLabV3+ with 4 GPUs for mIoU is as follows:

./ASL_retrain.sh miou_bezier_10k.py 4

You can also follow the provided configs to modify the mmsegmentation configs, and use Auto Seg-Loss for training other models on other datasets.

Searching for semantic segmentation metrics

The search command format is

./ASL_search.sh {CONFIG_NAME} [{NUM_GPUS}] [{SEED}]

For example, the command for searching for surrogate loss functions for mIoU with 8 GPUs is as follows:

./ASL_search.sh miou_bezier_lr=0.2_eps=0.2.py 8
Companion code for "Bayesian logistic regression for online recalibration and revision of risk prediction models with performance guarantees"

Companion code for "Bayesian logistic regression for online recalibration and revision of risk prediction models with performance guarantees" Installa

0 Oct 13, 2021
ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet, ICCV 2021 Update: 2021/03/11: update our new results. Now our T2T-ViT-14 w

YITUTech 1k Dec 31, 2022
This is a collection of all challenges in HKCERT CTF 2021

香港網絡保安新生代奪旗挑戰賽 2021 (HKCERT CTF 2021) This is a collection of all challenges (and writeups) in HKCERT CTF 2021 Challenges ID Chinese name Name Score S

10 Jan 27, 2022
discovering subdomains, hidden paths, extracting unique links

python-website-crawler discovering subdomains, hidden paths, extracting unique links pip install -r requirements.txt discover subdomain: You can give

merve 4 Sep 05, 2022
Official implementation of Pixel-Level Bijective Matching for Video Object Segmentation

BMVOS This is the official implementation of Pixel-Level Bijective Matching for Video Object Segmentation, to appear in WACV 2022. @article{cho2021pix

Suhwan Cho 13 Dec 14, 2022
OpenMMLab Pose Estimation Toolbox and Benchmark.

Introduction English | 简体中文 MMPose is an open-source toolbox for pose estimation based on PyTorch. It is a part of the OpenMMLab project. The master b

OpenMMLab 2.8k Dec 31, 2022
[NeurIPS 2021] "Delayed Propagation Transformer: A Universal Computation Engine towards Practical Control in Cyber-Physical Systems"

Delayed Propagation Transformer: A Universal Computation Engine towards Practical Control in Cyber-Physical Systems Introduction Multi-agent control i

VITA 6 May 05, 2022
PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+

PaddlePaddle Vision Transformers State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 🤖 PaddlePaddle Visual Transformers (PaddleViT or

1k Dec 28, 2022
ProjectOxford-ClientSDK - This repo has moved :house: Visit our website for the latest SDKs & Samples

This project has moved 🏠 We heard your feedback! This repo has been deprecated and each project has moved to a new home in a repo scoped by API and p

Microsoft 970 Nov 28, 2022
Code for "Learning to Regrasp by Learning to Place"

Learning2Regrasp Learning to Regrasp by Learning to Place, CoRL 2021. Introduction We propose a point-cloud-based system for robots to predict a seque

Shuo Cheng (成硕) 18 Aug 27, 2022
FlowTorch is a PyTorch library for learning and sampling from complex probability distributions using a class of methods called Normalizing Flows

FlowTorch is a PyTorch library for learning and sampling from complex probability distributions using a class of methods called Normalizing Flows.

Meta Incubator 272 Jan 02, 2023
A custom DeepStack model that has been trained detecting ONLY the USPS logo

This repository provides a custom DeepStack model that has been trained detecting ONLY the USPS logo. This was created after I discovered that the Deepstack OpenLogo custom model I was using did not

Stephen Stratoti 9 Dec 27, 2022
My published benchmark for a Kaggle Simulations Competition

Lux AI Working Title Bot Please refer to the Kaggle notebook for the comment section. The comment section contains my explanation on my code structure

Tong Hui Kang 29 Aug 22, 2022
Inverse Rendering for Complex Indoor Scenes: Shape, Spatially-Varying Lighting and SVBRDF From a Single Image

Inverse Rendering for Complex Indoor Scenes: Shape, Spatially-Varying Lighting and SVBRDF From a Single Image (Project page) Zhengqin Li, Mohammad Sha

209 Jan 05, 2023
Monocular 3D pose estimation. OpenVINO. CPU inference or iGPU (OpenCL) inference.

human-pose-estimation-3d-python-cpp RealSenseD435 (RGB) 480x640 + CPU Corei9 45 FPS (Depth is not used) 1. Run 1-1. RealSenseD435 (RGB) 480x640 + CPU

Katsuya Hyodo 8 Oct 03, 2022
official implementation for the paper "Simplifying Graph Convolutional Networks"

Simplifying Graph Convolutional Networks Updates As pointed out by #23, there was a subtle bug in our preprocessing code for the reddit dataset. After

Tianyi 727 Jan 01, 2023
TuckER: Tensor Factorization for Knowledge Graph Completion

TuckER: Tensor Factorization for Knowledge Graph Completion This codebase contains PyTorch implementation of the paper: TuckER: Tensor Factorization f

Ivana Balazevic 296 Dec 06, 2022
A CNN implementation using only numpy. Supports multidimensional images, stride, etc.

A CNN implementation using only numpy. Supports multidimensional images, stride, etc. Speed up due to heavy use of slicing and mathematical simplification..

2 Nov 30, 2021
[ICML 2021] "Graph Contrastive Learning Automated" by Yuning You, Tianlong Chen, Yang Shen, Zhangyang Wang

Graph Contrastive Learning Automated PyTorch implementation for Graph Contrastive Learning Automated [talk] [poster] [appendix] Yuning You, Tianlong C

Shen Lab at Texas A&M University 80 Nov 23, 2022
Users can free try their models on SIDD dataset based on this code

SIDD benchmark 1 Train python train.py If you want to train your network, just modify the yaml in the options folder. 2 Validation python validation.p

Yuzhi ZHAO 2 May 20, 2022