Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning. CVPR 2018

Overview

Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning

Tensorflow code and models for the paper:

Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning
Yin Cui, Yang Song, Chen Sun, Andrew Howard, Serge Belongie
CVPR 2018

This repository contains code and pre-trained models used in the paper and 2 demos to demonstrate: 1) the importance of pre-training data on transfer learning; 2) how to calculate domain similarity between source domain and target domain.

Notice that we used a mini validation set (./inat_minival.txt) contains 9,697 images that are randomly selected from the original iNaturalist 2017 validation set. The rest of valdiation images were combined with the original training set to train our model in the paper. There are 665,473 training images in total.

Dependencies:

Preparation:

  • Clone the repo with recursive:
git clone --recursive https://github.com/richardaecn/cvpr18-inaturalist-transfer.git
  • Install dependencies. Please refer to TensorFlow, pyemd, scikit-learn and scikit-image official websites for installation guide.
  • Download data and feature and unzip them into the same directory as the cloned repo. You should have two folders './data' and './feature' in the repo's directory.

Datasets (optional):

In the paper, we used data from 9 publicly available datasets:

We provide a download link that includes the entire CUB-200-2011 dataset and data splits for the rest of 8 datasets. The provided link contains sufficient data for this repo. If you would like to use other 8 datasets, please download them from the official websites and put them in the corresponding subfolders under './data'.

Pre-trained Models (optional):

The models were trained using TensorFlow-Slim. We implemented Squeeze-and-Excitation Networks (SENet) under './slim'. The pre-trained models can be downloaded from the following links:

Network Pre-trained Data Input Size Download Link
Inception-V3 ImageNet 299 link
Inception-V3 iNat2017 299 link
Inception-V3 iNat2017 448 link
Inception-V3 iNat2017 299 -> 560 FT1 link
Inception-V3 ImageNet + iNat2017 299 link
Inception-V3 SE ImageNet + iNat2017 299 link
Inception-V4 iNat2017 448 link
Inception-V4 iNat2017 448 -> 560 FT2 link
Inception-ResNet-V2 ImageNet + iNat2017 299 link
Inception-ResNet-V2 SE ImageNet + iNat2017 299 link
ResNet-V2 50 ImageNet + iNat2017 299 link
ResNet-V2 101 ImageNet + iNat2017 299 link
ResNet-V2 152 ImageNet + iNat2017 299 link

1 This model was trained with 299 input size on train + 90% val and then fine-tuned with 560 input size on 90% val.

2 This model was trained with 448 input size on train + 90% val and then fine-tuned with 560 input size on 90% val.

TensorFlow Hub also provides a pre-trained Inception-V3 299 on iNat2017 original training set here.

Featrue Extraction (optional):

Run the following Python script to extract feature:

python feature_extraction.py

To run this script, you need to download the checkpoint of Inception-V3 299 trained on iNat2017. The dataset and pre-trained model can be modified in the script.

We provide a download link that includes features used in the domos of this repo.

Demos

  1. Linear logistic regression on extracted features:

This demo shows the importance of pre-training data on transfer learning. Based on features extracted from an Inception-V3 pre-trained on iNat2017, we are able to achieve 89.9% classification accuracy on CUB-200-2011 with the simple logistic regression, outperforming most state-of-the-art methods.

LinearClassifierDemo.ipynb
  1. Calculating domain similarity by Earth Mover's Distance (EMD): This demo gives an example to calculate the domain similarity proposed in the paper. Results correspond to part of the Fig. 5 in the original paper.
DomainSimilarityDemo.ipynb

Training and Evaluation

  • Convert dataset into '.tfrecord':
python convert_dataset.py --dataset_name=cub_200 --num_shards=10
  • Train (fine-tune) the model on 1 GPU:
CUDA_VISIBLE_DEVICES=0 ./train.sh
  • Evaluate the model on another GPU simultaneously:
CUDA_VISIBLE_DEVICES=1 ./eval.sh
  • Run Tensorboard for visualization:
tensorboard --logdir=./checkpoints/cub_200/ --port=6006

Citation

If you find our work helpful in your research, please cite it as:

@inproceedings{Cui2018iNatTransfer,
  title = {Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning},
  author = {Yin Cui, Yang Song, Chen Sun, Andrew Howard, Serge Belongie},
  booktitle={CVPR},
  year={2018}
}
Owner
Yin Cui
Research Scientist at Google
Yin Cui
A fast Protein Chain / Ligand Extractor and organizer.

Are you tired of using visualization software, or full blown suites just to separate protein chains / ligands ? Are you tired of organizing the mess o

Amine Abdz 9 Nov 06, 2022
The 1st Place Solution of the Facebook AI Image Similarity Challenge (ISC21) : Descriptor Track.

ISC21-Descriptor-Track-1st The 1st Place Solution of the Facebook AI Image Similarity Challenge (ISC21) : Descriptor Track. You can check our solution

lyakaap 75 Jan 08, 2023
Syllabus del curso IIC2115 - Programación como Herramienta para la Ingeniería 2022/I

IIC2115 - Programación como Herramienta para la Ingeniería Videos y tutoriales Tutorial CMD Tutorial Instalación Python y Jupyter Tutorial de git-GitH

21 Nov 09, 2022
A `Neural = Symbolic` framework for sound and complete weighted real-value logic

Logical Neural Networks LNNs are a novel Neuro = symbolic framework designed to seamlessly provide key properties of both neural nets (learning) and s

International Business Machines 138 Dec 19, 2022
Official implementation for (Show, Attend and Distill: Knowledge Distillation via Attention-based Feature Matching, AAAI-2021)

Show, Attend and Distill: Knowledge Distillation via Attention-based Feature Matching Official pytorch implementation of "Show, Attend and Distill: Kn

Clova AI Research 80 Dec 16, 2022
Jittor implementation of PCT:Point Cloud Transformer

PCT: Point Cloud Transformer This is a Jittor implementation of PCT: Point Cloud Transformer.

MenghaoGuo 547 Jan 03, 2023
Facilitates implementing deep neural-network backbones, data augmentations

Introduction Nowadays, the training of Deep Learning models is fragmented and unified. When AI engineers face up with one specific task, the common wa

40 Dec 29, 2022
Automatic number plate recognition using tech: Yolo, OCR, Scene text detection, scene text recognation, flask, torch

Automatic Number Plate Recognition Automatic Number Plate Recognition (ANPR) is the process of reading the characters on the plate with various optica

Meftun AKARSU 52 Dec 22, 2022
Official PyTorch implementation of "Synthesis of Screentone Patterns of Manga Characters"

Manga Character Screentone Synthesis Official PyTorch implementation of "Synthesis of Screentone Patterns of Manga Characters" presented in IEEE ISM 2

Tsubota 2 Nov 20, 2021
Unofficial PyTorch implementation of Attention Free Transformer (AFT) layers by Apple Inc.

aft-pytorch Unofficial PyTorch implementation of Attention Free Transformer's layers by Zhai, et al. [abs, pdf] from Apple Inc. Installation You can i

Rishabh Anand 184 Dec 12, 2022
rliable is an open-source Python library for reliable evaluation, even with a handful of runs, on reinforcement learning and machine learnings benchmarks.

Open-source library for reliable evaluation on reinforcement learning and machine learning benchmarks. See NeurIPS 2021 oral for details.

Google Research 529 Jan 01, 2023
YOLO-v5 기반 단안 카메라의 영상을 활용해 차간 거리를 일정하게 유지하며 주행하는 Adaptive Cruise Control 기능 구현

자율 주행차의 영상 기반 차간거리 유지 개발 Table of Contents 프로젝트 소개 주요 기능 시스템 구조 디렉토리 구조 결과 실행 방법 참조 팀원 프로젝트 소개 YOLO-v5 기반으로 단안 카메라의 영상을 활용해 차간 거리를 일정하게 유지하며 주행하는 Adap

14 Jun 29, 2022
Reviatalizing Optimization for 3D Human Pose and Shape Estimation: A Sparse Constrained Formulation

Reviatalizing Optimization for 3D Human Pose and Shape Estimation: A Sparse Constrained Formulation This is the implementation of the approach describ

Taosha Fan 47 Nov 15, 2022
Paddle implementation for "Highly Efficient Knowledge Graph Embedding Learning with Closed-Form Orthogonal Procrustes Analysis" (NAACL 2021)

ProcrustEs-KGE Paddle implementation for Highly Efficient Knowledge Graph Embedding Learning with Orthogonal Procrustes Analysis 🙈 A more detailed re

Lincedo Lab 4 Jun 09, 2021
《Lerning n Intrinsic Grment Spce for Interctive Authoring of Grment Animtion》

Learning an Intrinsic Garment Space for Interactive Authoring of Garment Animation Overview This is the demo code for training a motion invariant enco

YuanBo 213 Dec 14, 2022
The official PyTorch code for NeurIPS 2021 ML4AD Paper, "Does Thermal data make the detection systems more reliable?"

MultiModal-Collaborative (MMC) Learning Framework for integrating RGB and Thermal spectral modalities This is the official code for NeurIPS 2021 Machi

NeurAI 12 Nov 02, 2022
This repository contains the DendroMap implementation for scalable and interactive exploration of image datasets in machine learning.

DendroMap DendroMap is an interactive tool to explore large-scale image datasets used for machine learning. A deep understanding of your data can be v

DIV Lab 33 Dec 30, 2022
In-Place Activated BatchNorm for Memory-Optimized Training of DNNs

In-Place Activated BatchNorm In-Place Activated BatchNorm for Memory-Optimized Training of DNNs In-Place Activated BatchNorm (InPlace-ABN) is a novel

1.3k Dec 29, 2022
A (PyTorch) imbalanced dataset sampler for oversampling low frequent classes and undersampling high frequent ones.

Imbalanced Dataset Sampler Introduction In many machine learning applications, we often come across datasets where some types of data may be seen more

Ming 2k Jan 08, 2023
A Real-ESRGAN equipped Colab notebook for CLIP Guided Diffusion

#360Diffusion automatically upscales your CLIP Guided Diffusion outputs using Real-ESRGAN. Latest Update: Alpha 1.61 [Main Branch] - 01/11/22 Layout a

78 Nov 02, 2022