LBBA-boosted WSOD

Summary

Our code is based on ruotianluo/pytorch-faster-rcnn and WSCDN.

We sincerely thank the authors for these resources.

A newer version of our code (based on Detectron2) is a work in progress.

Hardware

We train and evaluate our method on a single RTX 2080Ti GPU (11GB). A GPU with more memory is preferable (e.g., TITAN RTX with 24GB).

Requirements

  • Python 3.6 or higher
  • CUDA 10.1 with cuDNN 7.6.2
  • PyTorch 1.2.0
  • numpy 1.18.1
  • opencv 3.4.2

We provide a full requirements file, lbba_requirements.txt, in the workspace (the lbba_boosted_wsod directory).

Additional resources

Google Drive

Description

  • selective_search_data: precomputed proposals for VOC 2007/2012
  • pretrained_models/imagenet_pretrain: ImageNet-pretrained models for the WSOD backbone and the LBBA backbone
  • pretrained_models/pretrained_on_wsddn: pretrained WSOD networks for VOC 2007/2012; starting from these models usually gives faster and more stable convergence
  • models/voc07: our pretrained WSOD
  • models/lbba: our pretrained LBBA
  • codes_zip: code templates for the LBBA training procedure and the LBBA-boosted WSOD training procedure

Prepare

Environment

We use Anaconda to construct our experimental environment.

Install all required packages (or simply follow lbba_requirements.txt).
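
The environment setup described above could be scripted roughly as follows. This is a minimal sketch, not the authors' procedure: the environment name lbba, the choice of Python 3.6, and the assumption that lbba_requirements.txt is in pip format are all ours. It is meant to be run from the lbba_boosted_wsod directory.

```shell
#!/usr/bin/env bash
# Sketch only. ENV_NAME is a hypothetical name chosen for illustration.
# DRY_RUN=1 (the default) prints each command without executing it.
ENV_NAME="${ENV_NAME:-lbba}"
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Echo the command, then execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

setup_env() {
    # Python 3.6 is the minimum version listed under Requirements.
    run conda create -y -n "$ENV_NAME" python=3.6 || return 1
    # "conda run" executes pip inside the environment without "conda activate".
    # Assumption: lbba_requirements.txt is a pip-style requirements file.
    run conda run -n "$ENV_NAME" pip install -r lbba_requirements.txt || return 1
}

setup_env
```

With the default DRY_RUN=1 the script only prints the two commands for review; set DRY_RUN=0 to execute them.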

Essential Data

We have initialized all directories with gitkeep files.

First, cd lbba_boosted_wsod.

Then:

  • download selective_search_data/* into data/selective_search_data
  • download pretrained_models/imagenet_pretrain/* into data/imagenet_weights
  • download pretrained_models/pretrained_on_wsddn/* into data/wsddn_weights
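
Before training, it can help to confirm that the three data directories actually contain the downloaded files. The check below is a small sketch; the function name check_data_dirs, the LBBA_ROOT variable, and the assumption that the placeholder files are named .gitkeep are ours, not part of the repository.

```shell
#!/usr/bin/env bash
# Sketch only: check_data_dirs and LBBA_ROOT are hypothetical names.
# A directory counts as populated if it holds anything besides .gitkeep
# (assumed name of the placeholder file).
check_data_dirs() {
    local root="${1:-.}" missing=0 d
    for d in selective_search_data imagenet_weights wsddn_weights; do
        # ls -A also lists hidden files; grep -v ignores the placeholder.
        if [ -d "$root/data/$d" ] && ls -A "$root/data/$d" | grep -qv '^\.gitkeep$'; then
            echo "ok       data/$d"
        else
            echo "missing  data/$d"
            missing=1
        fi
    done
    return "$missing"
}

# Run from lbba_boosted_wsod, or point LBBA_ROOT at it.
check_data_dirs "${LBBA_ROOT:-.}" || echo "Download the missing files before training."
```

The function returns a non-zero status when any directory is absent or still empty, so it can also gate a training script.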

Datasets

Same as rbgirshick/py-faster-rcnn.

For example, for the PASCAL VOC 2007 dataset:

  1. Download the training, validation, test data and VOCdevkit

    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
  2. Extract all of these tars into one directory named VOCdevkit

    tar xvf VOCtrainval_06-Nov-2007.tar
    tar xvf VOCtest_06-Nov-2007.tar
    tar xvf VOCdevkit_08-Jun-2007.tar
  3. It should have this basic structure

    $VOCdevkit/                           # development kit
    $VOCdevkit/VOCcode/                   # VOC utility code
    $VOCdevkit/VOC2007                    # image sets, annotations, etc.
    # ... and several other directories ...
  4. Create symlinks for the PASCAL VOC dataset

    cd $FRCN_ROOT/data
    ln -s $VOCdevkit VOCdevkit2007
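
The structure from step 3 can be verified before creating the symlink. The following is a sketch under our own naming (check_voc_layout is a hypothetical helper); it checks only the two subdirectories listed above and assumes the devkit path is given in the VOCdevkit variable, defaulting to ./VOCdevkit.

```shell
#!/usr/bin/env bash
# Sketch only: check_voc_layout is a hypothetical helper, not part of the repo.
# It verifies the two subdirectories named in step 3.
check_voc_layout() {
    local devkit="$1" status=0 sub
    for sub in VOCcode VOC2007; do
        if [ -d "$devkit/$sub" ]; then
            echo "found    $devkit/$sub"
        else
            echo "missing  $devkit/$sub"
            status=1
        fi
    done
    return "$status"
}

check_voc_layout "${VOCdevkit:-VOCdevkit}" || echo "Extract the three tar files before creating the symlink."
```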

Evaluate our WSOD

Download models/voc07/voc07_55.8.pth to lbba_boosted_wsod/

./test_voc07.sh 0 pascal_voc vgg16 voc07_55.8.pth

Note that results can vary slightly across environments. For example, the same code obtains 55.8 mAP with CUDA 10.1 but 55.5 mAP with CUDA 11.

Train WSOD

Download models/lbba/lbba_final.pth (or lbba_init.pth) to lbba_boosted_wsod/

bash train_wsod.sh 1 pascal_voc vgg16 voc07_wsddn_pre lbba_final.pth

Note that we provide several LBBA checkpoints (initialization stage, final stage, and even the one-class adjuster mentioned in the supplementary material).

Citation

@InProceedings{Dong_2021_ICCV,
    author    = {Dong, Bowen and Huang, Zitong and Guo, Yuelin and Wang, Qilong and Niu, Zhenxing and Zuo, Wangmeng},
    title     = {Boosting Weakly Supervised Object Detection via Learning Bounding Box Adjusters},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {2876-2885}
}