CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation

Last update: Dec 30, 2022

Overview

[ICCV2021] TransReID: Transformer-based Object Re-Identification [pdf]

The official repository for TransReID: Transformer-based Object Re-Identification, which achieves state-of-the-art performance on object re-ID, including person re-ID and vehicle re-ID.

News

2021.12 We improved TransReID via self-supervised pre-training. Please refer to TransReID-SSL.
2021.3 We released the code of TransReID.

Pipeline

Ablation Study of Transformer-based Strong Baseline

Requirements

Installation

pip install -r requirements.txt
(We use torch 1.6.0, torchvision 0.7.0, timm 0.3.2, CUDA 10.1, and a 16G or 32G V100 for training and evaluation.
Note that we use torch.cuda.amp to speed up training, which requires PyTorch >= 1.6.)
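The version requirement above can be checked before launching a long run. The following is a minimal sketch, not part of the repository: `version_tuple` and `supports_amp` are hypothetical helper names, and the check only compares version strings, so it runs even where PyTorch is not installed.

```python
import re


def version_tuple(version):
    """Turn a version string such as '1.6.0+cu101' into (1, 6, 0)."""
    core = version.split("+")[0]  # drop a local build suffix like '+cu101'
    parts = []
    for piece in core.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def supports_amp(torch_version):
    """Return True if the version meets the PyTorch >= 1.6 requirement."""
    return version_tuple(torch_version) >= (1, 6)


# Typical use, assuming PyTorch is installed:
#   import torch
#   if not supports_amp(torch.__version__):
#       raise RuntimeError("torch.cuda.amp requires PyTorch >= 1.6")
```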

Prepare Datasets

mkdir data

Download the person datasets Market-1501, MSMT17, DukeMTMC-reID, Occluded-Duke, and the vehicle datasets VehicleID, VeRi-776. Then unzip them and rename the folders to match the following layout:

data
├── market1501
│   └── images ..
├── MSMT17
│   └── images ..
├── dukemtmcreid
│   └── images ..
├── Occluded_Duke
│   └── images ..
├── VehicleID_V1.0
│   └── images ..
└── VeRi
    └── images ..
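A quick way to confirm the layout above is to check that each top-level folder exists. This is a minimal sketch under that assumption only: `missing_datasets` is a hypothetical helper, and it does not inspect the contents of each dataset folder because the tree does not spell them out.

```python
from pathlib import Path

# Top-level folder names copied from the directory tree above.
EXPECTED_DIRS = [
    "market1501",
    "MSMT17",
    "dukemtmcreid",
    "Occluded_Duke",
    "VehicleID_V1.0",
    "VeRi",
]


def missing_datasets(root="data"):
    """Return the expected dataset folders that are absent under root."""
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_datasets("data")
    if missing:
        print("Missing dataset folders:", ", ".join(missing))
    else:
        print("All expected dataset folders are present.")
```

You only need the folders for the datasets you plan to train on, so a non-empty result is not necessarily an error.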

Prepare DeiT or ViT Pre-trained Models

Download the ImageNet-pretrained transformer models: ViT-Base, ViT-Small, DeiT-Small, DeiT-Base.

Training

We utilize 1 GPU for training.

python train.py --config_file configs/transformer_base.yml MODEL.DEVICE_ID "('your device id')" MODEL.STRIDE_SIZE ${1} MODEL.SIE_CAMERA ${2} MODEL.SIE_VIEW ${3} MODEL.JPM ${4} MODEL.TRANSFORMER_TYPE ${5} OUTPUT_DIR ${OUTPUT_DIR} DATASETS.NAMES "('your dataset name')"

Arguments

${1}: stride size for the pure transformer, e.g. [16, 16], [14, 14], [12, 12].
${2}: whether to use SIE (Side Information Embeddings) with camera labels, True or False.
${3}: whether to use SIE with viewpoint labels, True or False.
${4}: whether to use JPM (Jigsaw Patch Module), True or False.
${5}: transformer type, one of 'vit_base_patch16_224_TransReID', 'vit_small_patch16_224_TransReID', 'deit_small_patch16_224_TransReID'. (DeiT has the same structure as ViT, so you only need to change the ImageNet pretrained model.)
${OUTPUT_DIR}: folder for saving logs and checkpoints, e.g. ../logs/market1501
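To show how the arguments above map onto the command line, here is a minimal sketch that assembles the training command as an argument list. `build_train_command` is a hypothetical helper, not part of the repository, and the dataset name and output folder in the usage example are placeholders. It assumes each override is passed as a key followed by a single value, as in the command shown earlier.

```python
import shlex


def build_train_command(device_id, stride, sie_camera, sie_view, jpm,
                        transformer_type, output_dir, dataset,
                        config_file="configs/transformer_base.yml"):
    """Assemble the train.py invocation as a list of arguments."""
    overrides = {
        "MODEL.DEVICE_ID": f"('{device_id}')",
        "MODEL.STRIDE_SIZE": f"[{stride}, {stride}]",   # ${1}
        "MODEL.SIE_CAMERA": str(sie_camera),            # ${2}
        "MODEL.SIE_VIEW": str(sie_view),                # ${3}
        "MODEL.JPM": str(jpm),                          # ${4}
        "MODEL.TRANSFORMER_TYPE": transformer_type,     # ${5}
        "OUTPUT_DIR": output_dir,
        "DATASETS.NAMES": f"('{dataset}')",
    }
    command = ["python", "train.py", "--config_file", config_file]
    for key, value in overrides.items():
        command += [key, value]
    return command


if __name__ == "__main__":
    cmd = build_train_command(
        device_id=0, stride=12, sie_camera=True, sie_view=False, jpm=True,
        transformer_type="vit_base_patch16_224_TransReID",
        output_dir="../logs/market1501",   # placeholder output folder
        dataset="market1501",              # placeholder dataset name
    )
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

The list form can be handed to `subprocess.run` directly, which avoids shell quoting problems with values such as `('0')` and `[12, 12]`.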

Alternatively, you can train directly with the following configs and commands:

# DukeMTMC transformer-based baseline
python train.py --config_file configs/DukeMTMC/vit_base.yml MODEL.DEVICE_ID "('0')"
# DukeMTMC baseline + JPM
python train.py --config_file configs/DukeMTMC/vit_jpm.yml MODEL.DEVICE_ID "('0')"
# DukeMTMC baseline + SIE
python train.py --config_file configs/DukeMTMC/vit_sie.yml MODEL.DEVICE_ID "('0')"
# DukeMTMC TransReID (baseline + SIE + JPM)
python train.py --config_file configs/DukeMTMC/vit_transreid.yml MODEL.DEVICE_ID "('0')"
# DukeMTMC TransReID with stride size [12, 12]
python train.py --config_file configs/DukeMTMC/vit_transreid_stride.yml MODEL.DEVICE_ID "('0')"

# MSMT17
python train.py --config_file configs/MSMT17/vit_transreid_stride.yml MODEL.DEVICE_ID "('0')"
# OCC_Duke
python train.py --config_file configs/OCC_Duke/vit_transreid_stride.yml MODEL.DEVICE_ID "('0')"
# Market
python train.py --config_file configs/Market/vit_transreid_stride.yml MODEL.DEVICE_ID "('0')"
# VeRi
python train.py --config_file configs/VeRi/vit_transreid_stride.yml MODEL.DEVICE_ID "('0')"

# VehicleID (the dataset is large, so we use 4 V100 GPUs for training)
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 --master_port 66666 train.py --config_file configs/VehicleID/vit_transreid_stride.yml MODEL.DIST_TRAIN True
# or use the following script:
bash dist_train.sh

Tips: For person datasets with size 256x128, TransReID with stride occupies 12GB GPU memory and TransReID occupies 7GB GPU memory.
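One way to see why the stride setting costs more memory is to count the patch tokens the transformer has to process. The sketch below assumes a sliding window of 16x16 patches with no padding, which is our reading of the stride option; `num_patches` is a hypothetical helper and the result is a back-of-envelope token count, not a memory measurement.

```python
def num_patches(height, width, patch=16, stride=16):
    """Count sliding-window patches for an image of the given size."""
    rows = (height - patch) // stride + 1
    cols = (width - patch) // stride + 1
    return rows * cols


if __name__ == "__main__":
    for stride in (16, 14, 12):
        count = num_patches(256, 128, patch=16, stride=stride)
        print(f"stride {stride}: {count} patches")
    # stride 16: 128 patches
    # stride 14: 162 patches
    # stride 12: 210 patches
```

Self-attention cost grows faster than linearly in the number of tokens, so going from 128 to 210 patches is consistent with the higher memory figure quoted in the tip.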

Evaluation

python test.py --config_file 'choose which config to test' MODEL.DEVICE_ID "('your device id')" TEST.WEIGHT "('your path of trained checkpoints')"

Some examples:

# DukeMTMC
python test.py --config_file configs/DukeMTMC/vit_transreid_stride.yml MODEL.DEVICE_ID "('0')"  TEST.WEIGHT '../logs/duke_vit_transreid_stride/transformer_120.pth'
# MSMT17
python test.py --config_file configs/MSMT17/vit_transreid_stride.yml MODEL.DEVICE_ID "('0')" TEST.WEIGHT '../logs/msmt17_vit_transreid_stride/transformer_120.pth'
# OCC_Duke
python test.py --config_file configs/OCC_Duke/vit_transreid_stride.yml MODEL.DEVICE_ID "('0')" TEST.WEIGHT '../logs/occ_duke_vit_transreid_stride/transformer_120.pth'
# Market
python test.py --config_file configs/Market/vit_transreid_stride.yml MODEL.DEVICE_ID "('0')"  TEST.WEIGHT '../logs/market_vit_transreid_stride/transformer_120.pth'
# VeRi
python test.py --config_file configs/VeRi/vit_transreid_stride.yml MODEL.DEVICE_ID "('0')" TEST.WEIGHT '../logs/veri_vit_transreid_stride/transformer_120.pth'

# VehicleID (We test 10 times and get the final average score to avoid randomness)
python test.py --config_file configs/VehicleID/vit_transreid_stride.yml MODEL.DEVICE_ID "('0')" TEST.WEIGHT '../logs/vehicleID_vit_transreid_stride/transformer_120.pth'
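The averaging over 10 test runs mentioned for VehicleID can be sketched as follows. This is an illustration only: `average_over_trials` and the `evaluate_once` callback are hypothetical names, and how the repository's test script actually repeats and averages its runs is not shown here.

```python
import random
from statistics import mean


def average_over_trials(evaluate_once, trials=10, seed=0):
    """Run a randomized evaluation several times and average each metric.

    evaluate_once takes a random.Random instance and returns a dict that
    maps metric names (for example 'R1', 'R5') to floats.
    """
    rng = random.Random(seed)
    runs = [evaluate_once(rng) for _ in range(trials)]
    return {name: mean(run[name] for run in runs) for name in runs[0]}


if __name__ == "__main__":
    # Stand-in evaluation that returns noisy scores; a real one would draw
    # a random gallery split and score the model on it.
    def fake_evaluation(rng):
        return {"R1": 85.0 + rng.uniform(-0.5, 0.5),
                "R5": 97.0 + rng.uniform(-0.2, 0.2)}

    print(average_over_trials(fake_evaluation, trials=10))
```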

Trained Models and logs (Size 256)

Datasets	MSMT17	Market	Duke	OCC_Duke	VeRi	VehicleID
Model	mAP | R1	mAP | R1	mAP | R1	mAP | R1	mAP | R1	R1 | R5
Baseline(ViT)	61.8 | 81.8	87.1 | 94.6	79.6 | 89.0	53.8 | 61.1	79.0 | 96.6	83.5 | 96.7
Baseline(ViT)	model | log	model | log	model | log	model | log	model | log	model | test
TransReID*(ViT)	67.8 | 85.3	89.0 | 95.1	82.2 | 90.7	59.5 | 67.4	82.1 | 97.4	85.2 | 97.4
TransReID*(ViT)	model | log	model | log	model | log	model | log	model | log	model | test
TransReID*(DeiT)	66.3 | 84.0	88.5 | 95.1	81.9 | 90.7	57.7 | 65.2	82.4 | 97.1	86.0 | 97.6
TransReID*(DeiT)	model | log	model | log	model | log	model | log	model | log	model | test

Note: We reorganized the code, so the performance differs slightly from the numbers in the paper.
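For readers unfamiliar with the mAP and R1 columns, the sketch below computes both from ranked retrieval results. It is a simplified illustration, not the repository's evaluation code: the function names are hypothetical, and standard re-ID protocols typically also discard gallery images taken by the same camera as the query, which this sketch omits.

```python
def average_precision(ranked_matches):
    """AP for one query.

    ranked_matches is a list of booleans over the gallery, sorted from the
    most similar to the least similar image; True marks the same identity.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def rank_k_hit(ranked_matches, k=1):
    """True if a correct match appears within the top k results."""
    return any(ranked_matches[:k])


def evaluate(all_ranked_matches, k=1):
    """Return (mAP, rank-k accuracy) averaged over all queries."""
    n = len(all_ranked_matches)
    mean_ap = sum(average_precision(m) for m in all_ranked_matches) / n
    rank_k = sum(rank_k_hit(m, k) for m in all_ranked_matches) / n
    return mean_ap, rank_k


if __name__ == "__main__":
    queries = [
        [True, False, True],   # AP = (1/1 + 2/3) / 2
        [False, True],         # AP = 1/2
    ]
    mean_ap, r1 = evaluate(queries, k=1)
    print(f"mAP = {mean_ap:.4f}, R1 = {r1:.2f}")
```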

Acknowledgement

Codebase from reid-strong-baseline and pytorch-image-models.

We import the VeRi-776 viewpoint labels from this repo: https://github.com/Zhongdao/VehicleReIDKeyPointData

Citation

If you find this code useful for your research, please cite our paper:

@InProceedings{He_2021_ICCV,
    author    = {He, Shuting and Luo, Hao and Wang, Pichao and Wang, Fan and Li, Hao and Jiang, Wei},
    title     = {TransReID: Transformer-Based Object Re-Identification},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {15013-15022}
}

Contact

If you have any questions, please feel free to contact us. E-mail: [email protected], [email protected]

Owner

DamoCV