Semi-Supervised Learning, Object Detection, ICCV2021

Last update: Dec 27, 2022

Overview

End-to-End Semi-Supervised Object Detection with Soft Teacher

By Mengde Xu*, Zheng Zhang*, Han Hu, Jianfeng Wang, Lijuan Wang, Fangyun Wei, Xiang Bai, Zicheng Liu.

This repo is the official implementation of the ICCV 2021 paper "End-to-End Semi-Supervised Object Detection with Soft Teacher".

Citation

@article{xu2021end,
  title={End-to-End Semi-Supervised Object Detection with Soft Teacher},
  author={Xu, Mengde and Zhang, Zheng and Hu, Han and Wang, Jianfeng and Wang, Lijuan and Wei, Fangyun and Bai, Xiang and Liu, Zicheng},
  journal={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2021}
}

Main Results

Partial Labeled Data

Following STAC[1], we evaluate on 5 different data splits for each setting and report the average performance over the 5 splits. The results are as follows:

1% labeled data

Method	mAP	Model Weights	Config Files
Baseline	10.0	-	Config
Ours (thr=5e-2)	21.62	Drive	Config
Ours (thr=1e-3)	22.64	Drive	Config

5% labeled data

Method	mAP	Model Weights	Config Files
Baseline	20.92	-	Config
Ours (thr=5e-2)	30.42	Drive	Config
Ours (thr=1e-3)	31.7	Drive	Config

10% labeled data

Method	mAP	Model Weights	Config Files
Baseline	26.94	-	Config
Ours (thr=5e-2)	33.78	Drive	Config
Ours (thr=1e-3)	34.7	Drive	Config
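Each number in the partially labeled tables is an average over five data splits. A minimal sketch of that aggregation is shown here, assuming you have collected one bbox mAP per fold yourself. The helper name `average_map` and the sample values are hypothetical and are not results from the paper.

```python
from statistics import mean


def average_map(per_fold_map):
    """Return the mean mAP over the folds of one labeled-data setting."""
    if len(per_fold_map) == 0:
        raise ValueError("need at least one fold result")
    # Two decimals, matching the precision used in the tables.
    return round(mean(per_fold_map), 2)


# Placeholder values for illustration only; NOT results from the paper.
example_folds = [30.0, 31.0, 29.0, 30.5, 29.5]
print(average_map(example_folds))
```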

Full Labeled Data

Faster R-CNN (ResNet-50)

Model	mAP	Model Weights	Config Files
Baseline	40.9	-	Config
Ours (thr=5e-2)	44.05	Drive	Config
Ours (thr=1e-3)	44.6	Drive	Config
Ours* (thr=5e-2)	44.5	-	Config
Ours* (thr=1e-3)	44.9	-	Config

Faster R-CNN (ResNet-101)

Model	mAP	Model Weights	Config Files
Baseline	43.8	-	Config
Ours* (thr=5e-2)	46.8	-	Config
Ours* (thr=1e-3)	47.3	-	Config

Notes

Ours* means we use a longer training schedule.
thr indicates model.test_cfg.rcnn.score_thr in the config files. This inference trick was first introduced by Instant-Teaching[2].
All models are trained on 8 V100 GPUs.
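The key path model.test_cfg.rcnn.score_thr corresponds to a nested entry in the dict-based config. The fragment below is a sketch of where that value would sit, not a copy of any shipped config. In a real config file it would be merged with a base config rather than stand alone.

```python
# Hypothetical override fragment in the dict-based config style.
# Only the key path model.test_cfg.rcnn.score_thr comes from the notes above.
model = dict(
    test_cfg=dict(
        rcnn=dict(
            score_thr=1e-3,  # the "thr" value listed in the result tables
        )
    )
)

print(model["test_cfg"]["rcnn"]["score_thr"])
```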

Usage

Requirements

Ubuntu 16.04
Anaconda3 with python=3.6
Pytorch=1.9.0
mmdetection=2.16.0+fe46ffe
mmcv=1.3.9
wandb=0.10.31

Notes

We use wandb for visualization. If you don't want to use it, comment out lines 273-284 in configs/soft_teacher/base.py.
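An alternative to commenting out lines by hand is to filter the wandb hook out of the logging config. The sketch below assumes wandb is registered as a `WandbLoggerHook` entry under `log_config["hooks"]`. The actual content of base.py is not shown in this README and may differ. The helper `drop_wandb_hook` is hypothetical.

```python
def drop_wandb_hook(log_config):
    """Return a copy of log_config without any WandbLoggerHook entry."""
    hooks = [
        hook for hook in log_config.get("hooks", [])
        if hook.get("type") != "WandbLoggerHook"
    ]
    # dict(mapping, key=value) copies the mapping and overrides one key.
    return dict(log_config, hooks=hooks)


# Assumed shape of the logging section; base.py may look different.
log_config = dict(
    interval=50,
    hooks=[
        dict(type="TextLoggerHook"),
        dict(type="WandbLoggerHook"),
    ],
)
print(drop_wandb_hook(log_config))
```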

Installation

make install

Data Preparation

Download the COCO dataset
Execute the following command to generate data set splits:

# YOUR_DATA should be a directory that contains the COCO dataset.
# For example:
# YOUR_DATA/
#  coco/
#     train2017/
#     val2017/
#     unlabeled2017/
#     annotations/
ln -s ${YOUR_DATA} data
bash tools/dataset/prepare_coco_data.sh conduct
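Before generating the splits, it can help to confirm that the symlinked directory matches the layout described in the comments above. This is a minimal sketch, and the helper `missing_coco_dirs` is hypothetical and not part of the repo.

```python
from pathlib import Path

# Subdirectories listed in the layout comment above.
EXPECTED_SUBDIRS = ("train2017", "val2017", "unlabeled2017", "annotations")


def missing_coco_dirs(data_root):
    """List the expected subdirectories that are absent under data_root/coco."""
    coco = Path(data_root) / "coco"
    return [name for name in EXPECTED_SUBDIRS if not (coco / name).is_dir()]


# "data" is the symlink created by `ln -s ${YOUR_DATA} data`.
print(missing_coco_dirs("data"))
```

An empty list means all four directories were found.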

Training

To train a model in the partially labeled data setting:

# JOB_TYPE: 'baseline' or 'semi'; selects which kind of job to run
# PERCENT_LABELED_DATA: 1, 5, or 10; the percentage of the COCO training set that is labeled
# GPU_NUM: number of gpus to run the job
for FOLD in 1 2 3 4 5;
do
  bash tools/dist_train_partially.sh <JOB_TYPE> ${FOLD} <PERCENT_LABELED_DATA> <GPU_NUM>
done

For example, the following script trains our model on 10% labeled data with 8 GPUs:

for FOLD in 1 2 3 4 5;
do
  bash tools/dist_train_partially.sh semi ${FOLD} 10 8
done

To train a model in the fully labeled data setting:

bash tools/dist_train.sh <CONFIG_FILE_PATH> <NUM_GPUS>

For example, to train our R50 model with 8 GPUs:

bash tools/dist_train.sh configs/soft_teacher/soft_teacher_faster_rcnn_r50_caffe_fpn_coco_full_720k.py 8

Evaluation

bash tools/dist_test.sh <CONFIG_FILE_PATH> <CHECKPOINT_PATH> <NUM_GPUS> --eval bbox --cfg-options model.test_cfg.rcnn.score_thr=<THR>
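The result tables report two thresholds (5e-2 and 1e-3), so the same checkpoint is typically evaluated twice. The sketch below only assembles the command strings from the template above and does not run anything. The helper `build_eval_command` is hypothetical, and the checkpoint path is a placeholder.

```python
import shlex


def build_eval_command(config, checkpoint, num_gpus, score_thr):
    """Assemble one dist_test.sh invocation as a shell-safe string."""
    args = [
        "bash", "tools/dist_test.sh", config, checkpoint, str(num_gpus),
        "--eval", "bbox",
        "--cfg-options", f"model.test_cfg.rcnn.score_thr={score_thr}",
    ]
    return " ".join(shlex.quote(arg) for arg in args)


config = "configs/soft_teacher/soft_teacher_faster_rcnn_r50_caffe_fpn_coco_full_720k.py"
checkpoint = "work_dirs/downloaded.model"  # placeholder path

# Thresholds are passed as strings so they appear exactly as written.
for thr in ("5e-2", "1e-3"):
    print(build_eval_command(config, checkpoint, 8, thr))
```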

Inference

To run inference with a trained model and visualize the detection results:

# [IMAGE_FILE_PATH]: path to an image file on the local file system
# [CONFIG_FILE]: path to a config file
# [CHECKPOINT_PATH]: path to a trained model that matches the provided config file
# [OUTPUT_PATH]: directory in which to save the detection results
python demo/image_demo.py [IMAGE_FILE_PATH] [CONFIG_FILE] [CHECKPOINT_PATH] --output [OUTPUT_PATH]

For example:

Inference on a single image with the provided R50 model:

python demo/image_demo.py /tmp/tmp.png configs/soft_teacher/soft_teacher_faster_rcnn_r50_caffe_fpn_coco_full_720k.py work_dirs/downloaded.model --output work_dirs/

After the program completes, an image with the same name as the input will be saved to work_dirs.

Inference on many images with the provided R50 model:

python demo/image_demo.py '/tmp/*.jpg' configs/soft_teacher/soft_teacher_faster_rcnn_r50_caffe_fpn_coco_full_720k.py work_dirs/downloaded.model --output work_dirs/
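In the command above the pattern is quoted, so the shell passes it to the script unexpanded and the script has to resolve it. The sketch below shows how such a pattern can be expanded in Python. It assumes glob-style matching, which this README does not confirm for demo/image_demo.py. The helper `expand_image_pattern` is hypothetical.

```python
import glob


def expand_image_pattern(pattern):
    """Expand a shell-style pattern into a sorted list of matching paths."""
    return sorted(glob.glob(pattern))


# The list is empty if nothing matches.
print(expand_image_pattern("/tmp/*.jpg"))
```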

[1] A Simple Semi-Supervised Learning Framework for Object Detection

[2] Instant-Teaching: An End-to-End Semi-Supervised Object Detection Framework


Owner

Microsoft
