Normalization Matters in Weakly Supervised Object Localization (ICCV 2021)

Related tags

Deep LearningIVR
Overview

Normalization Matters in Weakly Supervised Object Localization (ICCV 2021)

99% of the code in this repository originates from this link.

ICCV 2021 paper

Jeesoo Kim1, Junsuk Choe2, Sangdoo Yun3, Nojun Kwak1

1 Seoul National University 2 Sogang University 3 Naver AI Lab

Weakly-supervised object localization (WSOL) enables finding an object using a dataset without any localization information. By simply training a classification model using only image-level annotations, the feature map of the model can be utilized as a score map for localization. In spite of many WSOL methods proposing novel strategies, there has not been any de facto standard about how to normalize the class activation map (CAM). Consequently, many WSOL methods have failed to fully exploit their own capacity because of the misuse of a normalization method. In this paper, we review many existing normalization methods and point out that they should be used according to the property of the given dataset. Additionally, we propose a new normalization method which substantially enhances the performance of any CAM-based WSOL methods. Using the proposed normalization method, we provide a comprehensive evaluation over three datasets (CUB, ImageNet and OpenImages) on three different architectures and observe significant performance gains over the conventional min-max normalization method in all the evaluated cases.

RubberDuck

Re-evaluated performance of several WSOL methods using different normalization methods. Comparison of several WSOL methods with different kinds of normalization methods for a class activation map. The accuracy has been evaluated under MaxBoxAccV2 with CUB-200-2011 dataset. All scores in this figure are the average scores of ResNet50, VGG16, and InceptionV3. In all WSOL methods, the performance using our normalization method, IVR, is the best.

Prerequisite

Dataset preparation, Code dependencies are available in the original repository. [Evaluating Weakly Supervised Object Localization Methods Right (CVPR 2020)] (paper)
This repository is highly dependent on this repo and we highly recommend users to refer the original one.

Licenses

The licenses corresponding to the dataset are summarized as follows

Dataset Images Class Annotations Localization Annotations
ImageNetV2 See the original Github See the original Github CC-BY-2.0 NaverCorp.
CUBV2 Follows original image licenses. See here. CC-BY-2.0 NaverCorp. CC-BY-2.0 NaverCorp.
OpenImages CC-BY-2.0 (Follows original image licenses. See here) CC-BY-4.0 Google LLC CC-BY-4.0 Google LLC

Detailed license files are summarized in the release directory.

Note: At the time of collection, images were marked as being licensed under the following licenses:

Attribution-NonCommercial License
Attribution License
Public Domain Dedication (CC0)
Public Domain Mark

However, we make no representations or warranties regarding the license status of each image. You should verify the license for each image yourself.

WSOL training and evaluation

We additionally support the following normalization methods:

  • Normalization.
    • Min-max
    • Max
    • PaS
    • IVR

Below is an example command line for the train+eval script.

python main.py --dataset_name CUB \
               --architecture vgg16 \
               --wsol_method cam \
               --experiment_name CUB_vgg16_CAM \
               --pretrained TRUE \
               --num_val_sample_per_class 5 \
               --large_feature_map FALSE \
               --batch_size 32 \
               --epochs 50 \
               --lr 0.00001268269 \
               --lr_decay_frequency 15 \
               --weight_decay 5.00E-04 \
               --override_cache FALSE \
               --workers 4 \
               --box_v2_metric True \
               --iou_threshold_list 30 50 70 \
               --eval_checkpoint_type last
               --norm_method ivr

See config.py for the full descriptions of the arguments, especially the method-specific hyperparameters.

Experimental results

Details about experiments are available in the paper.

Code license

This project is distributed under MIT license.

Copyright (c) 2020-present NAVER Corp.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

5. Citation

@article{kim2021normalization,
  title={Normalization Matters in Weakly Supervised Object Localization},
  author={Kim, Jeesoo and Choe, Junsuk and Yun, Sangdoo and Kwak, Nojun},
  journal={arXiv preprint arXiv:2107.13221},
  year={2021}
}
@inproceedings{choe2020cvpr,
  title={Evaluating Weakly Supervised Object Localization Methods Right},
  author={Choe, Junsuk and Oh, Seong Joon and Lee, Seungho and Chun, Sanghyuk and Akata, Zeynep and Shim, Hyunjung},
  year = {2020},
  booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
  note = {to appear},
  pubstate = {published},
  tppubtype = {inproceedings}
}
@article{wsol_eval_journal_submission,
  title={Evaluation for Weakly Supervised Object Localization: Protocol, Metrics, and Datasets},
  author={Choe, Junsuk and Oh, Seong Joon and Chun, Sanghyuk and Akata, Zeynep and Shim, Hyunjung},
  journal={arXiv preprint arXiv:2007.04178},
  year={2020}
}
Owner
Jeesoo Kim
Ph.D candidate at Seoul National University
Jeesoo Kim
Numerical-computing-is-fun - Learning numerical computing with notebooks for all ages.

As much as this series is to educate aspiring computer programmers and data scientists of all ages and all backgrounds, it is also a reminder to mysel

EKA foundation 758 Dec 25, 2022
Portfolio analytics for quants, written in Python

QuantStats: Portfolio analytics for quants QuantStats Python library that performs portfolio profiling, allowing quants and portfolio managers to unde

Ran Aroussi 2.7k Jan 08, 2023
Segmentation models with pretrained backbones. Keras and TensorFlow Keras.

Python library with Neural Networks for Image Segmentation based on Keras and TensorFlow. The main features of this library are: High level API (just

Pavel Yakubovskiy 4.2k Jan 09, 2023
WHENet: Real-time Fine-Grained Estimation for Wide Range Head Pose

WHENet: Real-time Fine-Grained Estimation for Wide Range Head Pose Yijun Zhou and James Gregson - BMVC2020 Abstract: We present an end-to-end head-pos

368 Dec 26, 2022
DeepFaceLab fork which provides IPython Notebook to use DFL with Google Colab

DFL-Colab — DeepFaceLab fork for Google Colab This project provides you IPython Notebook to use DeepFaceLab with Google Colaboratory. You can create y

779 Jan 05, 2023
A Joint Video and Image Encoder for End-to-End Retrieval

Frozen️ in Time ❄️ ️️️️ ⏳ A Joint Video and Image Encoder for End-to-End Retrieval project page | arXiv | webvid-data Repository containing the code,

225 Dec 25, 2022
Zen-NAS: A Zero-Shot NAS for High-Performance Deep Image Recognition

Zen-NAS: A Zero-Shot NAS for High-Performance Deep Image Recognition How Fast Compare to Other Zero-Shot NAS Proxies on CIFAR-10/100 Pre-trained Model

190 Dec 29, 2022
Tree LSTM implementation in PyTorch

Tree-Structured Long Short-Term Memory Networks This is a PyTorch implementation of Tree-LSTM as described in the paper Improved Semantic Representati

Riddhiman Dasgupta 529 Dec 10, 2022
Dilated Convolution for Semantic Image Segmentation

Multi-Scale Context Aggregation by Dilated Convolutions Introduction Properties of dilated convolution are discussed in our ICLR 2016 conference paper

Fisher Yu 764 Dec 26, 2022
StyleSwin: Transformer-based GAN for High-resolution Image Generation

StyleSwin This repo is the official implementation of "StyleSwin: Transformer-based GAN for High-resolution Image Generation". By Bowen Zhang, Shuyang

Microsoft 349 Dec 28, 2022
UNAVOIDS: Unsupervised and Nonparametric Approach for Visualizing Outliers and Invariant Detection Scoring

UNAVOIDS: Unsupervised and Nonparametric Approach for Visualizing Outliers and Invariant Detection Scoring Code Summary aggregate.py: this script aggr

1 Dec 28, 2021
Convnet transfer - Code for paper How transferable are features in deep neural networks?

How transferable are features in deep neural networks? This repository contains source code necessary to reproduce the results presented in the follow

Jason Yosinski 143 Sep 13, 2022
Citation Intent Classification in scientific papers using the Scicite dataset an Pytorch

Citation Intent Classification Table of Contents About the Project Built With Installation Usage Acknowledgments About The Project Citation Intent Cla

Federico Nocentini 4 Mar 04, 2022
Face Detection and Alignment using Multi-task Cascaded Convolutional Networks (MTCNN)

Face-Detection-with-MTCNN Face detection is a computer vision problem that involves finding faces in photos. It is a trivial problem for humans to sol

Chetan Hirapara 3 Oct 07, 2022
Contrastive Learning for Compact Single Image Dehazing, CVPR2021

AECR-Net Contrastive Learning for Compact Single Image Dehazing, CVPR2021. Official Pytorch based implementation. Paper arxiv Pytorch Version TODO: mo

glassy 253 Jan 01, 2023
A graph neural network (GNN) model to predict protein-protein interactions (PPI) with no sample features

A graph neural network (GNN) model to predict protein-protein interactions (PPI) with no sample features

2 Jul 25, 2022
Spearmint Bayesian optimization codebase

Spearmint Spearmint is a software package to perform Bayesian optimization. The Software is designed to automatically run experiments (thus the code n

Formerly: Harvard Intelligent Probabilistic Systems Group -- Now at Princeton 1.5k Dec 29, 2022
Robust Consistent Video Depth Estimation

[CVPR 2021] Robust Consistent Video Depth Estimation This repository contains Python and C++ implementation of Robust Consistent Video Depth, as descr

Facebook Research 213 Dec 17, 2022
DCGAN LSGAN WGAN-GP DRAGAN PyTorch

Recommendation Our GAN based work for facial attribute editing - AttGAN. News 8 April 2019: We re-implement these GANs by Tensorflow 2! The old versio

Zhenliang He 408 Nov 30, 2022
Faster RCNN with PyTorch

Faster RCNN with PyTorch Note: I re-implemented faster rcnn in this project when I started learning PyTorch. Then I use PyTorch in all of my projects.

Long Chen 1.6k Dec 23, 2022