🏆 The 1st Place Submission to AICity Challenge 2021 Natural Language-Based Vehicle Retrieval Track (Alibaba-UTS submission)

Last update: Dec 29, 2022

Related tags

Deep Learning AIC2021-T5-CLV

Overview

AI City 2021: Connecting Language and Vision for Natural Language-Based Vehicle Retrieval

🏆 The 1st Place Submission to AICity Challenge 2021 Natural Language-Based Vehicle Retrieval Track (Alibaba-UTS submission)

We have two codebases. For the final submission, we conduct the feature ensemble, where features are from two codebases.

Part One is at here: https://github.com/ShuaiBai623/AIC2021-T5-CLV

Part Two is at here: https://github.com/layumi/NLP-AICity2021

Prepare

Preprocess the dataset to prepare frames, motion maps, NLP augmentation

scripts/extract_vdo_frms.py is a Python script that is used to extract frames.

scripts/get_motion_maps.py is a Python script that is used to get motion maps.

scripts/deal_nlpaug.py is a Python script that is used for NLP augmentation.

Download the pretrained models of Part One to checkpoints. The checkpoints can be found here. The best score of a single model on TestA is 0.1927 from motion_effb3_NOCLS_nlpaug_320.pth.

The directory structures in data and checkpoints are as follows：

.
├── checkpoints
│   ├── motion_effb2_1CLS_nlpaug_288.pth
│   ├── motion_effb3_NOCLS_nlpaug_320.pth
│   ├── motion_SE_3CLS_nonlpaug_288.pth
│   ├── motion_SE_NOCLS_nlpaug_288.pth
│   └── motion_SE_NOCLS_nonlpaug_288.pth
└── data
    ├── AIC21_Track5_NL_Retrieval
    │   ├── train
    │   └── validation
    ├── motion_map 
    ├── test-queries.json
    ├── test-queries_nlpaug.json    ## NLP augmentation (Refer to scripts/deal_nlpaug.py)
    ├── test-tracks.json
    ├── train.json
    ├── train_nlpaug.json
    ├── train-tracks.json
    ├── train-tracks_nlpaug.json    ## NLP augmentation (Refer to scripts/deal_nlpaug.py)
    ├── val.json
    └── val_nlpaug.json             ## NLP augmentation (Refer to scripts/deal_nlpaug.py)

Part One

Modify the data paths in config.py

Train

The configuration files are in configs.

CUDA_VISIBLE_DEVICES=0,1,2,3 python -u main.py --name your_experiment_name --config your_config_file |tee log

Test

Change the RESTORE_FROM in your configuration file.

python -u test.py --config your_config_file

Extract the visual and text embeddings. The extracted embeddings can be found here.

python -u test.py --config configs/motion_effb2_1CLS_nlpaug_288.yaml
python -u test.py --config configs/motion_SE_NOCLS_nlpaug_288.yaml
python -u test.py --config configs/motion_effb2_1CLS_nlpaug_288.yaml
python -u test.py --config configs/motion_SE_3CLS_nonlpaug_288.yaml
python -u test.py --config configs/motion_SE_NOCLS_nonlpaug_288.yaml

Part Two

Link

Submission

During the inference, we average all the frame features of the target in each track as track features, the embeddings of text descriptions are also averaged as the query features. The cosine distance is used for ranking as the final result.

Reproduce the best submission. ALL extracted embeddings are in the folder output:

python scripts/get_submit.py

🏆 The 1st Place Submission to AICity Challenge 2021 Natural Language-Based Vehicle Retrieval Track (Alibaba-UTS submission)

Related tags

Overview

AI City 2021: Connecting Language and Vision for Natural Language-Based Vehicle Retrieval

Prepare

Part One

Train

Test

Part Two

Submission

Friend Links：

Owner

StyleGAN - Official TensorFlow Implementation

Official implementation of the paper: "LDNet: Unified Listener Dependent Modeling in MOS Prediction for Synthetic Speech"

Submission to Twitter's algorithmic bias bounty challenge

Codes for [NeurIPS'21] You are caught stealing my winning lottery ticket! Making a lottery ticket claim its ownership.

Source code for paper: Knowledge Inheritance for Pre-trained Language Models

Code for Paper "Evidential Softmax for Sparse MultimodalDistributions in Deep Generative Models"

This repo holds codes of the ICCV21 paper: Visual Alignment Constraint for Continuous Sign Language Recognition.

Bayesian-Torch is a library of neural network layers and utilities extending the core of PyTorch to enable the user to perform stochastic variational inference in Bayesian deep neural networks

This is an official repository of CLGo: Learning to Predict 3D Lane Shape and Camera Pose from a Single Image via Geometry Constraints

Automatically creates genre collections for your Plex media

The Easy-to-use Dialogue Response Selection Toolkit for Researchers

Node Editor Plug for Blender

This is the official implementation code repository of Underwater Light Field Retention : Neural Rendering for Underwater Imaging (Accepted by CVPR Workshop2022 NTIRE)

Minimalistic PyTorch training loop

[CoRL 2021] A robotics benchmark for cross-embodiment imitation.

A python module for scientific analysis of 3D objects based on VTK and Numpy

Official codes: Self-Supervised Learning by Estimating Twin Class Distribution

AbelNN: Deep Learning Python module from scratch

Official implementation of the paper "Topographic VAEs learn Equivariant Capsules"

FindFunc is an IDA PRO plugin to find code functions that contain a certain assembly or byte pattern, reference a certain name or string, or conform to various other constraints.