Experiment about Deep Person Re-identification with EfficientNet-v2

Last update: Jan 03, 2023

Overview

deep-efficient-person-reid

An experiment for a university project that applies a strong baseline to the person re-identification task.

We evaluated the baseline with Resnet50 and EfficientNet-v2 trained from scratch (no pretrained weights), and with Resnet50-IBN-A and EfficientNet-v2 pretrained on ImageNet. We used two datasets: Market-1501 and CUHK03.

Pipeline

Implementation Details

Random Erasing to transform input images.
EfficientNet-v2 / Resnet50 / Resnet50-IBN-A as backbone.
Stride = 1 for the last convolution layer. The embedding size is 2048 for Resnet50 / Resnet50-IBN-A and 1280 for EfficientNet-v2. During inference, embedding features pass through a batch norm layer, also known as a bottleneck, for better normalization.
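As an illustration of the Random Erasing step listed above, here is a minimal pure-Python sketch that overwrites one random rectangle of an image stored as a list of rows. The function name `random_erase` and the default values (`p`, `sl`, `sh`, `r1`, `fill`) are assumptions chosen for illustration; the values actually used in this repository are set in its config files and may differ.

```python
import math
import random


def random_erase(image, p=0.5, sl=0.02, sh=0.4, r1=0.3, fill=0.0, rng=random):
    """Return a copy of `image` with one random rectangle set to `fill`.

    image: list of rows (each row a list of pixel values).
    p: probability of erasing (assumed default).
    sl, sh: min / max erased area as a fraction of the image (assumed).
    r1: min aspect ratio of the rectangle; the max is 1 / r1 (assumed).
    """
    out = [row[:] for row in image]  # never modify the caller's image
    if rng.random() >= p:
        return out
    height, width = len(image), len(image[0])
    for _ in range(100):  # retry until the sampled rectangle fits
        area = rng.uniform(sl, sh) * height * width
        aspect = rng.uniform(r1, 1.0 / r1)
        h = int(round(math.sqrt(area * aspect)))
        w = int(round(math.sqrt(area / aspect)))
        if 0 < h < height and 0 < w < width:
            top = rng.randint(0, height - h)
            left = rng.randint(0, width - w)
            for y in range(top, top + h):
                for x in range(left, left + w):
                    out[y][x] = fill
            return out
    return out  # no rectangle fitted; return the unchanged copy
```

A real training pipeline would apply this to image tensors per channel, usually through a library transform rather than nested Python lists.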

Loss function combining 3 losses:
1. Triplet Loss with Hard Example Mining.
2. Classification Loss (Cross Entropy) with Label Smoothing.
3. Centroid Loss - Center Loss for reducing the distance of embeddings to their class center. When combined with Classification Loss, it helps prevent embeddings from collapsing.
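The three loss terms above can be sketched in plain Python as follows. This is a simplified illustration, not the code in `losses/`: the function names, the margin of 0.3, the smoothing factor of 0.1, the center-loss weight of 0.0005, and the choice to average over the batch are all assumptions made for the example.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def batch_hard_triplet_loss(embeddings, labels, margin=0.3):
    """For each anchor, use its farthest positive and its closest negative."""
    losses = []
    for i, anchor in enumerate(embeddings):
        pos = [euclidean(anchor, e) for j, e in enumerate(embeddings)
               if labels[j] == labels[i] and j != i]
        neg = [euclidean(anchor, e) for j, e in enumerate(embeddings)
               if labels[j] != labels[i]]
        if pos and neg:
            losses.append(max(0.0, max(pos) - min(neg) + margin))
    return sum(losses) / len(losses) if losses else 0.0


def label_smoothing_cross_entropy(logits, target, epsilon=0.1):
    """Cross entropy against a smoothed target distribution."""
    num_classes = len(logits)
    m = max(logits)  # subtract the max for a numerically stable log-softmax
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    loss = 0.0
    for k, v in enumerate(logits):
        # Target class gets 1 - epsilon + epsilon / N, the others epsilon / N.
        q = epsilon / num_classes + (1.0 - epsilon if k == target else 0.0)
        loss -= q * (v - log_z)
    return loss


def center_loss(embeddings, labels, centers):
    """Half the squared distance to the class center, averaged over the batch."""
    total = sum(euclidean(e, centers[y]) ** 2 for e, y in zip(embeddings, labels))
    return 0.5 * total / len(embeddings)


def total_loss(embeddings, logits_batch, labels, centers, center_weight=0.0005):
    """Sum of the three terms; `center_weight` is an assumed value."""
    ce = sum(label_smoothing_cross_entropy(l, y)
             for l, y in zip(logits_batch, labels)) / len(labels)
    return (batch_hard_triplet_loss(embeddings, labels) + ce
            + center_weight * center_loss(embeddings, labels, centers))
```

In practice the class centers are learnable parameters updated during training; here they are passed in as fixed values to keep the sketch short.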

The default optimizer is AMSGrad with a base learning rate of 3.5e-4 and a multistep learning rate scheduler that decays the rate at epochs 30 and 55. We also apply mixed precision during training.
On both datasets, pretrained models were trained for 60 epochs and non-pretrained models for 100 epochs.
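The learning rate schedule described above can be written as a small function. The base rate and the milestones come from the text; the decay factor `gamma=0.1` and the 0-based epoch indexing are assumptions, and any warm-up phase is left out because the text does not mention one.

```python
import bisect


def multistep_lr(epoch, base_lr=3.5e-4, milestones=(30, 55), gamma=0.1):
    """Learning rate for a 0-based epoch index under a multistep schedule.

    gamma is an assumed decay factor; the actual value is set in the
    repository's config files.
    """
    num_decays = bisect.bisect_right(milestones, epoch)  # milestones passed so far
    return base_lr * gamma ** num_decays


if __name__ == "__main__":
    for epoch in (0, 29, 30, 54, 55, 59):
        print(epoch, multistep_lr(epoch))
```

In PyTorch, a comparable setup would presumably combine `torch.optim.Adam(..., amsgrad=True)` with `torch.optim.lr_scheduler.MultiStepLR`, though the exact wiring in this repository may differ.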

Source Structure

.
├── config                  # hyperparameter settings
│   └── ...                 # yaml files
│
├── datasets                # data loader
│   └── ...
│
├── market1501              # market-1501 dataset
│
├── cuhk03_release          # cuhk03 dataset
│
├── samplers                # random samplers
│   └── ...
│
├── loggers                 # test weights and visualization results
│   └── runs
│
├── losses                  # loss functions
│   └── ...
│
├── nets                    # models
│   └── bacbones
│       └── ...
│
├── engine                  # training and testing procedures
│   └── ...
│
├── metrics                 # mAP and re-ranking
│   └── ...
│
├── utils                   # wrapper and util functions
│   └── ...
│
├── train.py                # train code
│
├── test.py                 # test code
│
└── visualize.py            # visualize results

Pretrained Models (on ImageNet)

EfficientNet-v2: link
Resnet50-IBN-A: link

Notebook

Notebook for training, inference, and visualization:

Setup

Download datasets: Market-1501 and CUHK03

Install dependencies, change directory to dertorch:

pip install -r requirements.txt
cd dertorch/

Modify the config files in /configs/. You can adjust the parameters there to tune training and testing.

Training:

python train.py --config_file=name_of_config_file
Ex: python train.py --config_file=efficientnetv2_market

Testing: results are saved in /loggers/runs, for example the result from EfficientNet-v2 (Market-1501): link

python test.py --config_file=name_of_config_file
Ex: python test.py --config_file=efficientnetv2_market

Visualization: results are saved in /loggers/runs/results/, for example the result from EfficientNet-v2 (Market-1501): link

python visualize.py --config_file=name_of_config_file
Ex: python visualize.py --config_file=efficientnetv2_market

Examples

Query image 1

Result image 1

Query image 2

Result image 2

Results

Market-1501

Models	Image Size	mAP	Rank-1	Rank-5	Rank-10	weights
Resnet50 (non-pretrained)	256x128	51.8	74.0	88.2	93.0	link
EfficientNet-v2 (non-pretrained)	256x128	56.5	78.5	91.1	94.4	link
Resnet50-IBN-A	256x128	77.1	90.7	97.0	98.4	link
EfficientNet-v2	256x128	69.7	87.1	95.3	97.2	link
Resnet50-IBN-A + Re-ranking	256x128	89.8	92.1	96.5	97.7	link
EfficientNet-v2 + Re-ranking	256x128	85.6	89.9	94.7	96.2	link

CUHK03

Models	Image Size	mAP	Rank-1	Rank-5	Rank-10	weights
Resnet50 (non-pretrained)	...	...	...	...	...	...
EfficientNet-v2 (non-pretrained)	256x128	10.1	10.1	21.1	29.5	link
Resnet50-IBN-A	256x128	41.2	41.8	63.1	71.2	link
EfficientNet-v2	256x128	40.6	42.9	63.1	72.5	link
Resnet50-IBN-A + Re-ranking	256x128	55.6	51.2	64.0	72.0	link
EfficientNet-v2 + Re-ranking	256x128	56.0	51.4	64.7	73.4	link

The EfficientNet-v2 results might improve with proper fine-tuning and more training epochs. Here we reused the best hyperparameters reported for the ResNet models (on the Market-1501 dataset) in this paper and trained for only 60 to 100 epochs.
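For readers unfamiliar with the metrics in the tables above, the following sketch shows one common way to compute Rank-k accuracy and mAP from ranked gallery lists. It is a simplified illustration with hypothetical function names, not the code in `metrics/`: in particular, it omits the usual filtering of gallery images that share both identity and camera with the query, and it does not include re-ranking.

```python
def rank_k_and_ap(ranked_gallery_ids, query_id, ks=(1, 5, 10)):
    """Return ({k: hit}, average_precision) for a single query.

    ranked_gallery_ids: gallery identities sorted by increasing distance
    to the query embedding.
    """
    matches = [gid == query_id for gid in ranked_gallery_ids]
    hits = {k: any(matches[:k]) for k in ks}  # correct match within top k?
    num_correct, precision_sum = 0, 0.0
    for position, is_match in enumerate(matches, start=1):
        if is_match:
            num_correct += 1
            precision_sum += num_correct / position  # precision at this hit
    ap = precision_sum / num_correct if num_correct else 0.0
    return hits, ap


def evaluate(rankings, ks=(1, 5, 10)):
    """rankings: list of (query_id, ranked_gallery_ids) pairs.

    Returns Rank-k accuracies and mAP as percentages.
    """
    cmc = {k: 0 for k in ks}
    aps = []
    for query_id, ranked in rankings:
        hits, ap = rank_k_and_ap(ranked, query_id, ks)
        for k in ks:
            cmc[k] += hits[k]
        aps.append(ap)
    n = len(rankings)
    scores = {f"Rank-{k}": 100.0 * cmc[k] / n for k in ks}
    return scores, 100.0 * sum(aps) / n
```

For example, a query with identity 1 and the ranked gallery `[2, 1, 3, 1]` misses at Rank-1, hits at Rank-5, and has an average precision of 0.5.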

Citation

@article{DBLP:journals/corr/abs-2104-13643,
  author    = {Mikolaj Wieczorek and
               Barbara Rychalska and
               Jacek Dabrowski},
  title     = {On the Unreasonable Effectiveness of Centroids in Image Retrieval},
  journal   = {CoRR},
  volume    = {abs/2104.13643},
  year      = {2021},
  url       = {https://arxiv.org/abs/2104.13643},
  archivePrefix = {arXiv},
  eprint    = {2104.13643},
  timestamp = {Tue, 04 May 2021 15:12:43 +0200},
  biburl    = {https://dblp.org/rec/journals/corr/abs-2104-13643.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

@InProceedings{Luo_2019_CVPR_Workshops,
author = {Luo, Hao and Gu, Youzhi and Liao, Xingyu and Lai, Shenqi and Jiang, Wei},
title = {Bag of Tricks and a Strong Baseline for Deep Person Re-Identification},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2019}
}

Adapted from: michuanhaohao


Owner

lan.nguyen2k
