LBBA-boosted WSOD

Last update: Sep 19, 2022

Summary

Our code is based on ruotianluo/pytorch-faster-rcnn and WSCDN.

We sincerely thank the authors for their resources.

A newer version of our code (based on Detectron2) is a work in progress.

Hardware

We use one RTX 2080Ti GPU (11GB) to train and evaluate our method. A GPU with more memory is preferable (e.g., a TITAN RTX with 24GB).

Requirements

Python 3.6 or higher
CUDA 10.1 with cuDNN 7.6.2
PyTorch 1.2.0
numpy 1.18.1
opencv 3.4.2

The full list of requirements is provided as lbba_requirements.txt in the workspace (the lbba_boosted_wsod directory).

Additional resources

Google Drive

Description

selective_search_data: precomputed proposals of VOC 2007/2012
pretrained_models/imagenet_pretrain: ImageNet-pretrained models for the WSOD backbone and the LBBA backbone
pretrained_models/pretrained_on_wsddn: WSOD networks pretrained on VOC 2007/2012; starting from these weights usually gives faster and more stable convergence
models/voc07: our pretrained WSOD
models/lbba: our pretrained LBBA
codes_zip: our code template of LBBA training procedure and LBBA-boosted WSOD training procedure

Prepare

Environment

We use Anaconda to construct our experimental environment.

Install all required packages (or simply follow lbba_requirements.txt).
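As a quick sanity check of the environment, the sketch below tests the active interpreter against the "Python 3.6 or higher" requirement. The helper names version_ge and check_python are hypothetical and not part of the repository, and the comparison assumes a sort that supports -V (GNU coreutils).

```shell
# version_ge A B: succeed when dotted version A is >= dotted version B.
# Assumes `sort -V` (version sort) is available on the host.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_python: report whether python3 meets the "3.6 or higher" requirement.
# Returns 2 when python3 cannot be run, 1 when it is too old, 0 otherwise.
check_python() {
    py_version="$(python3 -c 'import platform; print(platform.python_version())' 2>/dev/null)" || return 2
    if version_ge "$py_version" "3.6"; then
        echo "Python $py_version: OK"
    else
        echo "Python $py_version: too old, need 3.6 or higher"
        return 1
    fi
}
```

Running check_python inside the activated Anaconda environment before installing the remaining packages catches an interpreter mismatch early.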

Essential Data

All required directories already exist in the repository; each holds a gitkeep placeholder file.

First, cd lbba_boosted_wsod.

Then download selective_search_data/* into data/selective_search_data.

Download pretrained_models/imagenet_pretrain/* into data/imagenet_weights.

Download pretrained_models/pretrained_on_wsddn/* into data/wsddn_weights.
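The downloads above can be verified with a small check such as the following sketch. The function name check_data_dirs is hypothetical, and the placeholder file name .gitkeep is an assumption based on the "gitkeep files" mentioned above.

```shell
# check_data_dirs ROOT: print every expected data directory under ROOT that is
# missing or holds no downloaded files; return non-zero if any is reported.
check_data_dirs() {
    root="$1"
    status=0
    for sub in selective_search_data imagenet_weights wsddn_weights; do
        dir="$root/data/$sub"
        # Placeholder files (assumed to be named .gitkeep) do not count as content.
        if [ ! -d "$dir" ] || [ -z "$(find "$dir" -mindepth 1 ! -name '.gitkeep' -print -quit)" ]; then
            echo "missing or empty: $dir"
            status=1
        fi
    done
    return "$status"
}
```

From inside lbba_boosted_wsod, calling check_data_dirs . prints nothing and returns 0 once all three directories are populated.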

Datasets

Dataset preparation is the same as in rbgirshick/py-faster-rcnn.

For example, to prepare the PASCAL VOC 2007 dataset:

Download the training, validation, test data and VOCdevkit

wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar

Extract all of these tars into one directory named VOCdevkit

tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar

It should have this basic structure

$VOCdevkit/                           # development kit
$VOCdevkit/VOCcode/                   # VOC utility code
$VOCdevkit/VOC2007                    # image sets, annotations, etc.
# ... and several other directories ...

Create symlinks for the PASCAL VOC dataset

cd $FRCN_ROOT/data
ln -s $VOCdevkit VOCdevkit2007
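The symlink step can be guarded against a wrong or incomplete extraction path, as in this sketch. The function name link_vocdevkit is hypothetical; it only checks for the VOC2007 subdirectory shown in the structure above.

```shell
# link_vocdevkit DEVKIT DATA_DIR: check the extracted devkit and link it
# as DATA_DIR/VOCdevkit2007.
link_vocdevkit() {
    devkit="$1"
    data_dir="$2"
    if [ ! -d "$devkit/VOC2007" ]; then
        echo "error: $devkit/VOC2007 not found; extract the tars first" >&2
        return 1
    fi
    # Resolve to an absolute path so the link stays valid from any working directory.
    devkit_abs="$(cd "$devkit" && pwd)"
    # -n treats an existing link as a file; -f replaces a stale link.
    ln -sfn "$devkit_abs" "$data_dir/VOCdevkit2007"
}
```

For example, link_vocdevkit /path/to/VOCdevkit data, run from the repository root, has the same effect as the two commands above, where /path/to/VOCdevkit stands for wherever the tars were extracted.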

Evaluate our WSOD

Download models/voc07/voc07_55.8.pth to lbba_boosted_wsod/

./test_voc07.sh 0 pascal_voc vgg16 voc07_55.8.pth

Note that different environments might result in a slight performance drop. For example, we obtain 55.8 mAP with CUDA 10.1 but obtain 55.5 mAP using the same code with CUDA 11.
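A missing checkpoint is an easy mistake to make here, so a pre-flight check like the sketch below may help. The function name eval_cmd is hypothetical, and reading the first argument of test_voc07.sh as a GPU id is an assumption based on the py-faster-rcnn script convention, not something this README states.

```shell
# eval_cmd GPU_ID CHECKPOINT: print the evaluation command from this section,
# or fail if CHECKPOINT is not present in the current directory.
# Assumption: the first positional argument of test_voc07.sh is the GPU id.
eval_cmd() {
    gpu_id="$1"
    checkpoint="$2"
    if [ ! -f "$checkpoint" ]; then
        echo "error: checkpoint $checkpoint not found" >&2
        return 1
    fi
    echo "./test_voc07.sh $gpu_id pascal_voc vgg16 $checkpoint"
}
```

The function only prints the command, so its output can be inspected before it is run.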

Train WSOD

Download models/lbba/lbba_final.pth (or lbba_init.pth) to lbba_boosted_wsod/

bash train_wsod.sh 1 pascal_voc vgg16 voc07_wsddn_pre lbba_final.pth

Note that we provide several LBBA checkpoints: the initialization stage, the final stage, and the one-class adjuster mentioned in the supplementary material.

Citation

@InProceedings{Dong_2021_ICCV,
    author    = {Dong, Bowen and Huang, Zitong and Guo, Yuelin and Wang, Qilong and Niu, Zhenxing and Zuo, Wangmeng},
    title     = {Boosting Weakly Supervised Object Detection via Learning Bounding Box Adjusters},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {2876-2885}
}

Owner

Martin Dong
