Searched losses from our paper 'Loss Function Discovery for Object Detection via Convergence-Simulation Driven Search', accepted by ICLR 2021.

Overview

CSE-Autoloss

Designing proper loss functions for vision tasks has been a long-standing research direction for advancing the capability of existing models. For object detection, the well-established classification and regression loss functions have been carefully designed with diverse learning challenges in mind (e.g. class imbalance, hard negative samples, and scale variance). Inspired by recent progress in network architecture search, we explore the possibility of discovering new loss function formulations by directly searching over combinations of primitive operations, so that the learned losses not only fit diverse object detection challenges and reduce human effort, but also align better with the evaluation metric and have good mathematical convergence properties. Going beyond previous auto-loss work on face recognition and image classification, our work makes the first attempt to discover new loss functions for the challenging object detection task at the level of primitive operations, and finds the searched losses insightful. We propose an effective convergence-simulation driven evolutionary search algorithm, called CSE-Autoloss, which speeds up the search by regularizing the mathematical rationality of loss candidates via two progressive convergence simulation modules: convergence property verification and model optimization simulation. The best-discovered loss function combinations, CSE-Autoloss-A and CSE-Autoloss-B, outperform the default combinations (cross-entropy/focal loss for classification and L1 loss for regression) by 1.1% and 0.8% mAP for two-stage and one-stage detectors on COCO, respectively.

This repository contains demo training scripts for the best-searched loss combinations from our ICLR 2021 paper, Loss Function Discovery for Object Detection via Convergence-Simulation Driven Search.

Installation

Please refer to get_started.md for installation.

Get Started

Please see get_started.md for the basic usage of MMDetection.

Searched Loss

Two-Stage Best-Discovered Loss

CSE_Autoloss_A_cls='Neg(Dot(Mul(Y,Add(1,Sin(Z))),Log(Softmax(X))))'

CSE_Autoloss_A_reg='Add(1,Neg(Add(Div(I,U),Neg(Div(Add(E,Neg(Add(I,2))),E)))))'

One-Stage Best-Discovered Loss

CSE_Autoloss_B_cls='Neg(Add(Mul(Q,Mul(Add(1,Serf(Sig(NY))),Log(Sig(X)))),Mul(Add(Sgdf(X),Neg(Q)),Mul(Add(Add(1,Neg(Q)),Neg(Add(1,Neg(Sig(X))))),Log(Add(1,Neg(Sig(X))))))))'

CSE_Autoloss_B_reg='Neg(Div(Add(Div(Neg(Add(Neg(E),Add(1,I))),Neg(Add(3,Add(2,U)))),Add(Div(E,E),Div(Neg(E),Neg(1)))),Neg(Add(Div(Neg(Add(U,Div(I,1))),Neg(3)),Neg(E)))))'

[1] U, I, E, and W denote the union, intersection, enclosing box, and intersection-over-union (IoU) between the bounding box prediction and the ground truth. X and Y denote the class prediction and label.
[2] Dot denotes the dot product, Serf the scaled error function (erf), and Sgdf the scaled Gudermannian function (gd). For more details on "S"-shaped curves, see the wiki.
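
The loss strings above are nested prefix expressions over primitive operations. As a rough illustration of how such a string can be read, the sketch below parses the two-stage regression expression and evaluates it on scalar inputs. It is not the repository's implementation: the helper names (tokenize, parse, evaluate), the plain arithmetic meaning given to Add, Mul, Div, and Neg, and the sample values for I, U, and E are all assumptions made for illustration.

```python
import re

# Two-stage regression loss string, copied from the list above.
CSE_AUTOLOSS_A_REG = "Add(1,Neg(Add(Div(I,U),Neg(Div(Add(E,Neg(Add(I,2))),E)))))"

# Assumed semantics for the arithmetic primitives (not taken from the repository).
OPS = {
    "Add": lambda a, b: a + b,
    "Mul": lambda a, b: a * b,
    "Div": lambda a, b: a / b,
    "Neg": lambda a: -a,
}


def tokenize(expr):
    """Split an expression into names, integer constants, and punctuation."""
    return re.findall(r"[A-Za-z]+|\d+|[(),]", expr)


def parse(tokens, pos=0):
    """Parse one term starting at tokens[pos]; return (tree, next_pos)."""
    tok = tokens[pos]
    if pos + 1 < len(tokens) and tokens[pos + 1] == "(":
        args = []
        pos += 2  # skip the operator name and "("
        while True:
            arg, pos = parse(tokens, pos)
            args.append(arg)
            if tokens[pos] == ")":
                return (tok, args), pos + 1
            pos += 1  # skip ","
    return tok, pos + 1  # leaf: a variable name or a constant


def evaluate(tree, env):
    """Evaluate a parsed tree; env maps variable names to numbers."""
    if isinstance(tree, tuple):
        name, args = tree
        return OPS[name](*(evaluate(arg, env) for arg in args))
    return float(tree) if tree.isdigit() else env[tree]


tree, _ = parse(tokenize(CSE_AUTOLOSS_A_REG))
# Hypothetical scalar areas: intersection 2, union 4, enclosing box 8.
loss = evaluate(tree, {"I": 2.0, "U": 4.0, "E": 8.0})
print(loss)  # prints 1.0
```

Written out, the parsed expression is 1 - I/U + (E - I - 2)/E, which for the sample values gives 1 - 0.5 + 0.5 = 1.0.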

Performance

Performance on COCO val is as follows.

| Detector | Loss | Bbox mAP | Command |
| --- | --- | --- | --- |
| Faster R-CNN R50 | CSE-Autoloss-A | 38.5% | Link |
| Faster R-CNN R101 | CSE-Autoloss-A | 40.2% | Link |
| Cascade R-CNN R50 | CSE-Autoloss-A | 40.5% | Link |
| Mask R-CNN R50 | CSE-Autoloss-A | 39.1% | Link |
| FCOS R50 | CSE-Autoloss-B | 39.6% | Link |
| ATSS R50 | CSE-Autoloss-B | 40.5% | Link |

[1] We replace the centerness_target in FCOS and ATSS with the IoU between bbox_pred and bbox_target. Please see more details at fcos_head.py and atss_head.py.

[2] For the searched loss combinations, the loss_bbox weight for ATSS is set to 1 (instead of 2). Please see more details here.
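
As a rough illustration of note [1], the IoU that replaces the centerness target can be computed for a single pair of boxes as shown here. This is a minimal plain-Python sketch, not the code in fcos_head.py or atss_head.py; the function name box_iou and the (x1, y1, x2, y2) box layout are assumptions made for illustration.

```python
def box_iou(pred, target):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if there is one.
    ix1, iy1 = max(pred[0], target[0]), max(pred[1], target[1])
    ix2, iy2 = min(pred[2], target[2]), min(pred[3], target[3])
    # Clamp to zero so that disjoint boxes give an empty intersection.
    inter = max(ix2 - ix1, 0.0) * max(iy2 - iy1, 0.0)
    area_pred = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_target = (target[2] - target[0]) * (target[3] - target[1])
    union = area_pred + area_target - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1).
print(box_iou((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)))
```

Like centerness, this value lies in [0, 1], and it reaches 1 only when the predicted box matches the target exactly.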

Quick start to train the model with searched/default loss combinations

# cls - classification, reg - regression

# Train with searched classification loss and searched regression loss
python -m torch.distributed.launch --nproc_per_node=$GPUS --master_port=$PORT ./tools/train.py $CONFIG --loss_cls $SEARCH_CLS_LOSS --loss_reg $SEARCH_REG_LOSS --launcher pytorch;

# Train with searched classification loss and default regression loss
python -m torch.distributed.launch --nproc_per_node=$GPUS --master_port=$PORT ./tools/train.py $CONFIG --loss_cls $SEARCH_CLS_LOSS --launcher pytorch;

# Train with default classification loss and searched regression loss
python -m torch.distributed.launch --nproc_per_node=$GPUS --master_port=$PORT ./tools/train.py $CONFIG --loss_reg $SEARCH_REG_LOSS --launcher pytorch;

# Train with default classification loss and default regression loss
python -m torch.distributed.launch --nproc_per_node=$GPUS --master_port=$PORT ./tools/train.py $CONFIG --launcher pytorch;
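
The commands above expect several shell variables to be set beforehand. A minimal sketch of one way to set them, using the two-stage loss strings listed under Searched Loss. The GPU count, port, and config path are placeholder assumptions, not values taken from this repository.

```shell
# Assumed values; adjust to your machine.
GPUS=8
PORT=29500
# Hypothetical path; substitute a detector config file from this repository.
CONFIG=path/to/config.py

# Single quotes stop the shell from interpreting the parentheses.
SEARCH_CLS_LOSS='Neg(Dot(Mul(Y,Add(1,Sin(Z))),Log(Softmax(X))))'
SEARCH_REG_LOSS='Add(1,Neg(Add(Div(I,U),Neg(Div(Add(E,Neg(Add(I,2))),E)))))'
```

Without the quotes, the shell would treat the parentheses in the loss expressions as syntax and reject the assignment.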

Acknowledgement

Thanks to the MMDetection team for their powerful deep learning detection framework, and to the Huawei Noah's Ark Lab AI Theory Group for their numerous V100 GPUs.

Citation

If you use this toolbox or benchmark in your research, please cite this project.

@inproceedings{
  liu2021loss,
  title={Loss Function Discovery for Object Detection via Convergence-Simulation Driven Search},
  author={Peidong Liu and Gengwei Zhang and Bochao Wang and Hang Xu and Xiaodan Liang and Yong Jiang and Zhenguo Li},
  booktitle={International Conference on Learning Representations},
  year={2021},
  url={https://openreview.net/forum?id=5jzlpHvvRk}
}
@article{mmdetection,
  title   = {{MMDetection}: Open MMLab Detection Toolbox and Benchmark},
  author  = {Chen, Kai and Wang, Jiaqi and Pang, Jiangmiao and Cao, Yuhang and
             Xiong, Yu and Li, Xiaoxiao and Sun, Shuyang and Feng, Wansen and
             Liu, Ziwei and Xu, Jiarui and Zhang, Zheng and Cheng, Dazhi and
             Zhu, Chenchen and Cheng, Tianheng and Zhao, Qijie and Li, Buyu and
             Lu, Xin and Zhu, Rui and Wu, Yue and Dai, Jifeng and Wang, Jingdong
             and Shi, Jianping and Ouyang, Wanli and Loy, Chen Change and Lin, Dahua},
  journal= {arXiv preprint arXiv:1906.07155},
  year={2019}
}