TrTr: Visual Tracking with Transformer

Related tags

Deep LearningTrTr
Overview

TrTr: Visual Tracking with Transformer

We propose a novel tracker network based on a powerful attention mechanism called Transformer encoder-decoder architecture to gain global and rich contextual interdependencies. In this new architecture, features of the template image is processed by a self-attention module in the encoder part to learn strong context information, which is then sent to the decoder part to compute cross-attention with the search image features processed by another self-attention module. In addition, we design the classification and regression heads using the output of Transformer to localize target based on shape-agnostic anchor. We extensively evaluate our tracker TrTr, on several benchmarks and our method performs favorably against state-of-the-art algorithms.

Network architecture of TrTr for visual tracking

Installation

Install dependencies

$ ./install.sh ~/anaconda3 trtr 

note1: suppose you have the anaconda installation path under ~/anaconda3.

note2: please select a proper cuda-toolkit version to install Pytorch from conda, the default is 10.1. However, for RTX3090, please select 11.0. Then the above installation command would be $ ./install.sh ~/anaconda3 trtr 11.0.

Activate conda environment

$ conda activate trtr

Quick Start: Using TrTr

Webcam demo

Offline Model

$ python demo.py --tracker.checkpoint networks/trtr_resnet50.pth --use_baseline_tracker

Online Model

$ python demo.py --tracker.checkpoint networks/trtr_resnet50.pth

image sequences (png, jpeg)

add option --video_name ${video_dir}

video (mp4 or avi)

add option --video_name ${video_name}

Benchmarks

Download testing datasets

Please read this README.md to prepare the dataset.

Basic usage

Test tracker

$ cd benchmark
$ python test.py --cfg_file ../parameters/experiment/vot2018/offline.yaml
  • --cfg_file: the yaml file containing the hyper-parameter for each datasets. Please check ./benchmark/parameters/experiment for more yaml files
    • online model for VOT2018: python test.py --cfg_file ../parameters/experiment/vot2018/online.yaml
    • online model for OTB: python test.py --cfg_file ../parameters/experiment/otb/online.yaml
  • --result_path: optional parameter to specify a directory to store the tracking result. Default value is results, which generate ./benchmark/results/${dataset_name}
  • --model_name: optional parameter to specify the name of tracker name under the result path. Default value is trtr, which yield a tracker directory of ./benchmark/results/${dataset_name}/trtr
  • --vis: visualize tracking
  • --repetition: repeat number. For example, you should assign --repetition 15 for VOT benchmark following the official evaluation.

Eval tracker

$ cd benchmark
$ python eval.py
  • --dataset: parameter to specify the benchmark. Default value is VOT2018. Please assign other bench name, e.g., OTB, VOT2019, UAV, etc.
  • --tracker_path: parameter to specify the result directory. Default value is ./benchmark/results. This is a parameter related to --result_path parameter in python test.py.
  • --num: parameter to specify the thread number for evaluation multiple tracker results. Default is 1.

(Option) Hyper-parameter search

$ python hp_search.py --tracker.checkpoint ../networks/trtr_resnet50.pth --tracker.search_sizes 280 --separate --repetition 1  --use_baseline_tracker --tracker.model.transformer_mask True

Train

Download training datasets

Please read this README.md to prepare the training dataset.

Download VOT2018 dataset

  1. Please download VOT2018 dataset following [this REAMDE], which is necessary for testing the model during training.
  2. Or you skip this testing process by assigning several parameter, which are explained later.

Test with single GPU

$ python main.py  --cfg_file ./parameters/train/default.yaml --output_dir train

note1: please check ./parameters/train/default.yaml for the parameters for training note2: --output_dir to assign the path to store the training result. The above commmand genearte ./train note3: maybe you have to modify the file limit: ulimit -n 8192. Write in ~/.bashrc maybe better. note4: you can a larger value for --benchmark_start_epoch than for --epochs to skip benchmark test. e.g., --benchmark_start_epoch 21 and --epochs 20

debug mode for quick checking the training process:

$ python main.py  --cfg_file ./parameters/train/default.yaml  --batch_size 16 --dataset.paths ./datasets/yt_bb/dataset/Curation  ./datasets/vid/dataset/Curation/ --dataset.video_frame_ranges 3 100  --dataset.num_uses 100 100  --dataset.eval_num_uses 100 100  --resume networks/trtr_resnet50.pth --benchmark_start_epoch 0 --epochs 10

Multi GPUs

multi GPUs in single machine

$ python -m torch.distributed.launch --nproc_per_node=2 --use_env main.py --cfg_file ./parameters/train/default.yaml --output_dir train

--nproc_per_node: is the number of GPU to use. The above command means use two GPUs in a machine.

multi GPUs in multi machines

Master Machine

$ python -m torch.distributed.launch --nproc_per_node=2 --nnodes=2 --node_rank=0 --master_addr="${MASTER_IP_ADDRESS}" --master_port=${port} --use_env main.py --cfg_file ./parameters/train/default.yaml --output_dir train  --benchmark_start_epoch 8
  • --nnodes: number of machine to use. The above command means two machines.
  • --node_rank: the id for each machine. Master should be 0.
  • master_addr: assign the IP address of master machine
  • master_port: open port (e.g., 8080)

Slave1 Machine

$ python -m torch.distributed.launch --nproc_per_node=2 --nnodes=2 --node_rank=1 --master_addr="${MASTER_IP_ADDRESS}" --master_port=${port} --use_env main.py --cfg_file ./parameters/train/default.yaml
Owner
趙 漠居(Zhao, Moju)
Project Lecture in the Uiversity of Tokyo.
趙 漠居(Zhao, Moju)
StyleGAN2-ada for practice

This version of the newest PyTorch-based StyleGAN2-ada is intended mostly for fellow artists, who rarely look at scientific metrics, but rather need a working creative tool. Tested on Python 3.7 + Py

vadim epstein 170 Nov 16, 2022
This is the official released code for our paper, The Emergence of Objectness: Learning Zero-Shot Segmentation from Videos

The-Emergence-of-Objectness This is the official released code for our paper, The Emergence of Objectness: Learning Zero-Shot Segmentation from Videos

44 Oct 08, 2022
MODNet: Trimap-Free Portrait Matting in Real Time

MODNet is a model for real-time portrait matting with only RGB image input.

Zhanghan Ke 2.8k Dec 30, 2022
Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners

DART Implementation for ICLR2022 paper Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners. Environment

ZJUNLP 83 Dec 27, 2022
Self-Supervised Monocular DepthEstimation with Internal Feature Fusion(arXiv), BMVC2021

DIFFNet This repo is for Self-Supervised Monocular DepthEstimation with Internal Feature Fusion(arXiv), BMVC2021 A new backbone for self-supervised de

Hang 94 Dec 25, 2022
In this project, two programs can help you take full agvantage of time on the model training with a remote server

In this project, two programs can help you take full agvantage of time on the model training with a remote server, which can push notification to your phone about the information during model trainin

GrayLee 8 Dec 27, 2022
https://sites.google.com/cornell.edu/recsys2021tutorial

Counterfactual Learning and Evaluation for Recommender Systems (RecSys'21 Tutorial) Materials for "Counterfactual Learning and Evaluation for Recommen

yuta-saito 45 Nov 10, 2022
DuBE: Duple-balanced Ensemble Learning from Skewed Data

DuBE: Duple-balanced Ensemble Learning from Skewed Data "Towards Inter-class and Intra-class Imbalance in Class-imbalanced Learning" (IEEE ICDE 2022 S

6 Nov 12, 2022
VIsually-Pivoted Audio and(N) Text

VIP-ANT: VIsually-Pivoted Audio and(N) Text Code for the paper Connecting the Dots between Audio and Text without Parallel Data through Visual Knowled

Yän.PnG 16 Nov 04, 2022
Towards Long-Form Video Understanding

Towards Long-Form Video Understanding Chao-Yuan Wu, Philipp Krähenbühl, CVPR 2021 [Paper] [Project Page] [Dataset] Citation @inproceedings{lvu2021,

Chao-Yuan Wu 69 Dec 26, 2022
Code for "MetaMorph: Learning Universal Controllers with Transformers", Gupta et al, ICLR 2022

MetaMorph: Learning Universal Controllers with Transformers This is the code for the paper MetaMorph: Learning Universal Controllers with Transformers

Agrim Gupta 50 Jan 03, 2023
A Python library for working with arbitrary-dimension hypercomplex numbers following the Cayley-Dickson construction of algebras.

Hypercomplex A Python library for working with quaternions, octonions, sedenions, and beyond following the Cayley-Dickson construction of hypercomplex

7 Nov 04, 2022
This repo is a C++ version of yolov5_deepsort_tensorrt. Packing all C++ programs into .so files, using Python script to call C++ programs further.

yolov5_deepsort_tensorrt_cpp Introduction This repo is a C++ version of yolov5_deepsort_tensorrt. And packing all C++ programs into .so files, using P

41 Dec 27, 2022
GNN-based Recommendation Benchma

GRecX A Fair Benchmark for GNN-based Recommendation Preliminary Comparison DiffNet-Yelp dataset (featureless) Algo 73 Oct 17, 2022

Deep Q-learning for playing chrome dino game

[PYTORCH] Deep Q-learning for playing Chrome Dino

Viet Nguyen 68 Dec 05, 2022
A Web API for automatic background removal using Deep Learning. App is made using Flask and deployed on Heroku.

Automatic_Background_Remover A Web API for automatic background removal using Deep Learning. App is made using Flask and deployed on Heroku. 👉 https:

Gaurav 16 Oct 29, 2022
Notebook and code to synthesize complex and highly dimensional datasets using Gretel APIs.

Gretel Trainer This code is designed to help users successfully train synthetic models on complex datasets with high row and column counts. The code w

Gretel.ai 24 Nov 03, 2022
Gans-in-action - Companion repository to GANs in Action: Deep learning with Generative Adversarial Networks

GANs in Action by Jakub Langr and Vladimir Bok List of available code: Chapter 2: Colab, Notebook Chapter 3: Notebook Chapter 4: Notebook Chapter 6: C

GANs in Action 914 Dec 21, 2022
Implementation of Nyström Self-attention, from the paper Nyströmformer

Nyström Attention Implementation of Nyström Self-attention, from the paper Nyströmformer. Yannic Kilcher video Install $ pip install nystrom-attention

Phil Wang 95 Jan 02, 2023
Space-event-trace - Tracing service for spaceteam events

space-event-trace Tracing service for TU Wien Spaceteam events. This service is

TU Wien Space Team 2 Jan 04, 2022