Official implementation of CATs: Cost Aggregation Transformers for Visual Correspondence NeurIPS'21

Last update: Jan 04, 2023

Overview

For more information, check out the paper on [arXiv].

Training and evaluation with different backbones will be added soon.

Check out our new paper! [arXiv]

Network

Our model CATs is illustrated below:

Environment Settings

git clone https://github.com/SunghwanHong/CATs
cd CATs

conda create -n CATs python=3.6
conda activate CATs

pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html
pip install -U scikit-image
pip install git+https://github.com/albumentations-team/albumentations
pip install tensorboardX termcolor timm tqdm requests pandas
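
After running the commands above, a quick check that the packages are visible to the active interpreter can save a failed run later. The snippet below is a minimal sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and the import names (for example `skimage` for scikit-image) are assumed from how these packages are usually imported.

```python
import importlib.util
import sys

# Import names assumed to correspond to the pip packages listed above
# (scikit-image is imported as "skimage").
REQUIRED = [
    "torch", "torchvision", "torchaudio", "skimage", "albumentations",
    "tensorboardX", "termcolor", "timm", "tqdm", "requests", "pandas",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found, without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

If anything is reported missing, re-run the corresponding `pip install` line inside the activated `CATs` environment.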

Evaluation

Download the pre-trained weights from Link.
All datasets are automatically downloaded into the directory specified by the argument datapath.

Result on SPair-71k: (PCK 49.9%)

  python test.py --pretrained "/path_to_pretrained_model/spair" --benchmark spair

Result on SPair-71k, feature backbone frozen: (PCK 42.4%)

  python test.py --pretrained "/path_to_pretrained_model/spair_frozen" --benchmark spair

Results on PF-PASCAL: (PCK 75.4%, 92.6%, 96.4%)

  python test.py --pretrained "/path_to_pretrained_model/pfpascal" --benchmark pfpascal

Results on PF-PASCAL, feature backbone frozen: (PCK 67.5%, 89.1%, 94.9%)

  python test.py --pretrained "/path_to_pretrained_model/pfpascal_frozen" --benchmark pfpascal
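
The results above are reported as PCK (percentage of correct keypoints). As a rough illustration of the metric, the sketch below counts a predicted keypoint as correct when it lies within `alpha * ref_length` of its ground-truth location. It is not the repository's evaluation code: the function name `pck` and its parameters are hypothetical, and the choice of reference length (image size or object bounding box) is an assumption that depends on each benchmark's protocol.

```python
import math


def pck(pred_kps, gt_kps, ref_length, alpha=0.1):
    """Fraction of predicted keypoints within alpha * ref_length of the ground truth.

    pred_kps, gt_kps: equally long sequences of (x, y) pairs.
    ref_length: normalising length, e.g. the larger side of the image or of the
                object bounding box (an assumption; protocols differ).
    """
    if len(pred_kps) != len(gt_kps):
        raise ValueError("pred_kps and gt_kps must have the same length")
    if not gt_kps:
        return 0.0
    threshold = alpha * ref_length
    correct = sum(
        1
        for (px, py), (gx, gy) in zip(pred_kps, gt_kps)
        if math.hypot(px - gx, py - gy) <= threshold
    )
    return correct / len(gt_kps)


if __name__ == "__main__":
    # Toy keypoints, made up for illustration only.
    pred = [(10.0, 10.0), (50.0, 52.0), (100.0, 140.0)]
    gt = [(12.0, 11.0), (50.0, 50.0), (100.0, 100.0)]
    for alpha in (0.05, 0.1, 0.15):
        print(f"PCK@{alpha}: {pck(pred, gt, ref_length=200.0, alpha=alpha):.3f}")
```

A larger `alpha` loosens the threshold, so PCK can only stay the same or increase as `alpha` grows.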

Acknowledgement

We borrow code from public projects (huge thanks to all of them), mainly DHPF and GLU-Net.

BibTeX

If you find this research useful, please consider citing:

@inproceedings{cho2021cats,
  title={CATs: Cost Aggregation Transformers for Visual Correspondence},
  author={Cho, Seokju and Hong, Sunghwan and Jeon, Sangryul and Lee, Yunsung and Sohn, Kwanghoon and Kim, Seungryong},
  booktitle={Thirty-Fifth Conference on Neural Information Processing Systems},
  year={2021}
}
