Code for the paper Interact, Embed, and EnlargE (IEEE): Boosting Modality-specific Representations for Multi-Modal Person Re-identification.

Last update: Dec 12, 2022


Overview

Interact, Embed, and EnlargE (IEEE): Boosting Modality-specific Representations for Multi-Modal Person Re-identification

We provide the code for reproducing the results of our paper Interact, Embed, and EnlargE (IEEE): Boosting Modality-specific Representations for Multi-Modal Person Re-identification.

Installation

Basic environment: Python 3.6, PyTorch 1.8.0, CUDA 11.1.
Our code structure is based on Torchreid. More details can be found at https://github.com/KaiyangZhou/deep-person-reid , and you can install the packages listed in the Torchreid requirements.

# create environment
cd AAAI2022_IEEE/
conda create --name ieeeReid python=3.6
conda activate ieeeReid

# install dependencies
# make sure `which python` and `which pip` point to the correct path
pip install -r requirements.txt

# install torch and torchvision (select the proper cuda version to suit your machine)
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge

# install torchreid (no need to rebuild it after you modify the source code)
python setup.py develop
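
After the steps above, a quick check can confirm that the interpreter and packages match the versions this README targets. The script below is a minimal sketch, not part of the repository: `describe_environment` is a hypothetical helper name, and it reports a package as `None` when it cannot be imported rather than failing.

```python
import importlib
import sys


def describe_environment():
    """Collect interpreter and package versions; missing packages map to None."""
    info = {
        "python": "%d.%d" % sys.version_info[:2],
        "torch": None,
        "cuda_build": None,
        "cuda_available": None,
        "torchreid": None,
    }
    try:
        torch = importlib.import_module("torch")
        info["torch"] = torch.__version__
        # CUDA version PyTorch was built against (None for CPU-only builds)
        info["cuda_build"] = torch.version.cuda
        info["cuda_available"] = torch.cuda.is_available()
    except ImportError:
        pass
    try:
        torchreid = importlib.import_module("torchreid")
        info["torchreid"] = getattr(torchreid, "__version__", "unknown")
    except ImportError:
        pass
    return info


if __name__ == "__main__":
    for name, value in describe_environment().items():
        print("{}: {}".format(name, value))
```

If the installation succeeded, the output should show Python 3.6, a PyTorch 1.8.0 build, and a non-empty `torchreid` entry.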

Getting started

Use the settings in im_r50_softmax_256x128_amsgrad_RGBNT_ieee_part_margin.yaml to reproduce the results of the full IEEE model.

python ./scripts/mainMultiModal.py --config-file ./configs/im_r50_softmax_256x128_amsgrad_RGBNT_ieee_part_margin.yaml --seed 40

You can run the other methods by using the following configuration files:

# MLFN
./configs/im_r50_softmax_256x128_amsgrad_RGBNT_mlfn.yaml

# HACNN
./configs/im_r50_softmax_256x128_amsgrad_RGBNT_hacnn.yaml

# OSNet
./configs/im_r50_softmax_256x128_amsgrad_RGBNT_osnet.yaml

# HAMNet
./configs/im_r50_softmax_256x128_amsgrad_RGBNT_hamnet.yaml

# PFNet
./configs/im_r50_softmax_256x128_amsgrad_RGBNT_hamnet.yaml

# full IEEE
./configs/im_r50_softmax_256x128_amsgrad_RGBNT_ieee_part_margin.yaml
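
To run several of these configurations in sequence, the command line can be assembled programmatically. The following is a small sketch under assumptions: `build_command` and the `METHODS` table are hypothetical helpers, and it assumes the baselines accept the same `--config-file` and `--seed` arguments as the full IEEE command shown earlier. PFNet is left out because the list above gives it the same file as HAMNet.

```python
import subprocess

SCRIPT = "./scripts/mainMultiModal.py"
CONFIG_PREFIX = "./configs/im_r50_softmax_256x128_amsgrad_RGBNT_"

# Config file suffixes copied from the list above
METHODS = {
    "MLFN": "mlfn",
    "HACNN": "hacnn",
    "OSNet": "osnet",
    "HAMNet": "hamnet",
    "IEEE": "ieee_part_margin",
}


def build_command(method, seed=40):
    """Return the argument list for one training run."""
    config = "{}{}.yaml".format(CONFIG_PREFIX, METHODS[method])
    return ["python", SCRIPT, "--config-file", config, "--seed", str(seed)]


def run_all(methods, seed=40):
    """Run each method in turn; stop at the first failing run."""
    for method in methods:
        subprocess.run(build_command(method, seed), check=True)


if __name__ == "__main__":
    # Print the commands instead of launching training
    for name in METHODS:
        print(" ".join(build_command(name)))
```

Call `run_all(["OSNet", "IEEE"])` from the repository root to launch the runs one after another.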

Details

The Cross-modal Interacting Module (CIM) and the Relation-based Embedding Module (REM) are implemented in .\torchreid\models\ieee3modalPart.py. The Multi-modal Margin Loss (3M loss) is implemented in .\torchreid\losses\multi_modal_margin_loss_new.py.

Ablation study settings

You can enable or disable these two modules and the loss by changing the corresponding code.

Cross-modal Interacting Module (CIM) and Relation-based Embedding Module (REM)

# change the code in .\torchreid\models\ieee3modalPart.py

class IEEE3modalPart(nn.Module):
    def __init__(···
    ):
        modal_number = 3
        fc_dims = [128]
        pooling_dims = 768
        super(IEEE3modalPart, self).__init__()
        self.loss = loss
        self.parts = 6
        
        self.backbone = nn.ModuleList(···
        )
		
        # using Cross-modal Interacting Module (CIM)
        self.interaction = True
        # using channel attention in CIM
        self.attention = True
        
        # using Relation-based Embedding Module (REM)
        self.using_REM = True
        
        ···
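
The three flags above give a small grid of ablation variants. The sketch below only enumerates flag combinations to edit into the class by hand; it does not touch the model. `ablation_settings` is a hypothetical helper, and skipping `attention=True` when `interaction=False` is an assumption based on the comment that channel attention belongs to CIM.

```python
from itertools import product


def ablation_settings():
    """List the flag combinations worth trying, from no modules to the full model."""
    settings = []
    for interaction, attention, using_rem in product((False, True), repeat=3):
        if attention and not interaction:
            # Assumption: channel attention has no effect without CIM
            continue
        settings.append(
            {
                "interaction": interaction,
                "attention": attention,
                "using_REM": using_rem,
            }
        )
    return settings


if __name__ == "__main__":
    for setting in ablation_settings():
        print(setting)
```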

Multi-modal Margin Loss (3M loss)

# change the code in .\configs\your_config_file.yaml

# using Multi-modal Margin Loss (3M loss); you can change the margin by modifying the "ieee_margin" parameter.
···
loss:
  name: 'margin'
  softmax:
    label_smooth: True
  ieee_margin: 1
  weight_m: 1.0
  weight_x: 1.0
···

# using only CE loss
···
loss:
  name: 'softmax'
  softmax:
    label_smooth: True
  weight_x: 1.0
···
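
As a rough illustration of how the two weights in the first configuration might act, the snippet below scales each loss term by its weight and sums them. This follows the way Torchreid combines weighted loss terms, but it is an assumption about this repository rather than a reading of its code; `combine_losses` and its argument names are hypothetical, and the loss values are made-up numbers.

```python
def combine_losses(ce_loss, margin_loss, weight_x=1.0, weight_m=1.0):
    """Weighted sum of the cross-entropy term and the 3M margin term (assumed form)."""
    return weight_x * ce_loss + weight_m * margin_loss


# With weight_x = weight_m = 1.0, both terms count equally.
total = combine_losses(ce_loss=2.0, margin_loss=0.5)

# A zero weight on the margin term leaves only the CE term,
# comparable to the CE-only setting.
ce_only = combine_losses(ce_loss=2.0, margin_loss=0.5, weight_m=0.0)
```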
