Official Implementation of Few-shot Visual Relationship Co-localization

Last update: Oct 13, 2022

Overview

VRC

Official implementation of the Few-shot Visual Relationship Co-localization (ICCV 2021) paper

project page | paper

Requirements

Use Python >= 3.8.5. Conda is recommended: https://docs.anaconda.com/anaconda/install/linux/
Use PyTorch 1.7.0 with CUDA 10.2.
Install the remaining dependencies from requirements.txt.

To set up the environment:

# create new env vrc
$ conda create -n vrc python=3.8.5

# activate vrc
$ conda activate vrc

# install pytorch, torchvision
$ conda install pytorch==1.7.0 torchvision==0.8.0 cudatoolkit=10.2 -c pytorch

# install other dependencies
$ pip install -r requirements.txt
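
After installation, a quick sanity check can confirm that the active environment matches the versions listed under Requirements. The following is a minimal sketch, not part of this repo: the helper names `parse_version` and `check_environment` are hypothetical, and the check assumes the only hard constraints are Python >= 3.8.5 and PyTorch 1.7.0 with a visible CUDA device.

```python
import itertools
import sys


def parse_version(text):
    """Turn a version string such as '1.7.0+cu102' into a tuple like (1, 7, 0)."""
    core = text.split("+")[0]  # drop a local build suffix such as '+cu102'
    parts = []
    for piece in core.split(".")[:3]:
        # keep only the leading digits of each component ('0a0' -> '0')
        digits = "".join(itertools.takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:  # pad '1.7' to (1, 7, 0)
        parts.append(0)
    return tuple(parts)


def check_environment(min_python=(3, 8, 5), expected_torch=(1, 7, 0)):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if tuple(sys.version_info[:3]) < min_python:
        problems.append(
            "Python %s is older than %s"
            % (sys.version.split()[0], ".".join(map(str, min_python)))
        )
    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed in this environment")
    else:
        if parse_version(torch.__version__) != expected_torch:
            problems.append(
                "PyTorch %s found, expected %s"
                % (torch.__version__, ".".join(map(str, expected_torch)))
            )
        if not torch.cuda.is_available():
            problems.append("CUDA is not available to PyTorch")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Running the script inside the activated vrc environment prints one warning per mismatch and nothing if every check passes.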

Training

Preparing dataset

Download the VG (Visual Genome) images from https://visualgenome.org/
Extract Faster R-CNN features from the VG images using data_preparation/vrc_extract_frcnn_feats.py. Please follow instructions here.
Download the VrR-VG dataset from http://vrr-vg.com/ or Google Drive Link

Training VR Encoder (VTransE)

Training parameters

To check or update the training, model, and dataset parameters, see VR_Encoder/configs.

To train VR Encoder:

$ python train_vr_encoder.py

Training VR Similarity Network (Relation Network)

Training parameters

To check or update the training, testing, model, and dataset parameters, see VR_SimilarityNetwork/configs.

To train VR Similarity Network:

$ python SimilarityNetworkTrain.py

To train VR Similarity Network (w/ concat as VR Encoding):

$ python ConcatplusSimilarityNetworkTrain.py
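
The two similarity-network training commands above differ in how a visual relationship (VR) is encoded. The sketch below illustrates the general idea only and is not taken from this repo: it assumes the default encoding follows the VTransE formulation, where the relation is the difference between projected object and subject features, and that the concat variant simply joins the two feature vectors. The function names, the projection matrices `w_subj` and `w_obj`, and the toy features are all hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def translation_encoding(subj_feat, obj_feat, w_subj, w_obj):
    """VTransE-style encoding: relation ~ W_o * x_o - W_s * x_s."""
    projected_subj = matvec(w_subj, subj_feat)
    projected_obj = matvec(w_obj, obj_feat)
    return [o - s for o, s in zip(projected_obj, projected_subj)]


def concat_encoding(subj_feat, obj_feat):
    """Concat-style encoding: subject features followed by object features."""
    return list(subj_feat) + list(obj_feat)


# Toy 2-D features; real Faster R-CNN features are much higher-dimensional.
subj = [1.0, 2.0]
obj = [3.0, 5.0]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for learned projections

print(translation_encoding(subj, obj, identity, identity))  # [2.0, 3.0]
print(concat_encoding(subj, obj))  # [1.0, 2.0, 3.0, 5.0]
```

The translation encoding keeps the output at the projection dimension, while the concat encoding doubles the feature dimension, so a network consuming it would need a correspondingly larger input layer.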

To evaluate (set the evaluation settings in test_config.yaml):

$ python FullModelTest.py
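
The commands in this section can be chained by a small driver script. The following is a rough sketch under stated assumptions, not something shipped with this repo: it assumes the stages run in the order they are listed here (VR encoder, then similarity network, then evaluation) and that each script is launched from the directory that contains it. The names `PIPELINE`, `build_commands`, and `run_pipeline` are hypothetical.

```python
import subprocess
import sys

# Stage label and script name, in the order the README lists them.
PIPELINE = [
    ("train VR encoder", "train_vr_encoder.py"),
    ("train VR similarity network", "SimilarityNetworkTrain.py"),
    ("evaluate full model", "FullModelTest.py"),
]


def build_commands(python=sys.executable):
    """Return (label, argv) pairs for every stage of the pipeline."""
    return [(label, [python, script]) for label, script in PIPELINE]


def run_pipeline(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for label, cmd in build_commands():
        print("[%s] %s" % (label, " ".join(cmd)))
        if not dry_run:
            # check=True stops the pipeline as soon as one stage fails
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

With `dry_run=True` the script only prints the commands, which is a cheap way to confirm the order before starting long training runs. To use the concat variant, the second entry would be swapped for ConcatplusSimilarityNetworkTrain.py.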

Cite

If you find this code/paper useful for your research, please consider citing.

@InProceedings{teotiaMMM2021,
  author    = "Teotia, Revant and Mishra, Vaibhav and Maheshwari, Mayank and Mishra, Anand",
  title     = "Few-shot Visual Relationship Co-Localization",
  booktitle = "ICCV",
  year      = "2021",
}

Acknowledgements

This repo uses https://gitlab.com/meetshah1995/vqa-maskrcnn-benchmark and scripts from https://github.com/facebookresearch/mmf for Faster R-CNN feature extraction.

Code provided by https://github.com/zawlin/cvpr17_vtranse and https://github.com/yangxuntu/vrd helped in implementing the VR encoder.

Contact

For any clarification, comment, or suggestion, please create an issue or contact Revant, Vaibhav, or Mayank.
