WebUAV-3M: A Benchmark Unveiling the Power of Million-Scale Deep UAV Tracking [Paper Link]

Abstract

In this work, we contribute a new million-scale Unmanned Aerial Vehicle (UAV) tracking benchmark, called WebUAV-3M. First, we collect 4,485 videos with more than 3M frames from the Internet. Then, we devise an efficient and scalable Semi-Automatic Target Annotation (SATA) pipeline to label the target in every frame of WebUAV-3M. To the best of our knowledge, WebUAV-3M, with its dense bounding-box annotations, is by far the largest public UAV tracking benchmark. By establishing a million-scale annotated benchmark covering a wide range of target categories, we expect to pave the way for follow-up studies in UAV tracking. Moreover, considering the close connections among visual appearance, natural language, and audio, we enrich WebUAV-3M with natural language specifications and audio descriptions, encouraging the exploration of natural language features and audio cues for UAV tracking. Equipped with this benchmark, we delve into million-scale deep UAV tracking problems, aiming to provide the community with a dedicated large-scale benchmark for training deep UAV trackers and evaluating UAV tracking approaches. Extensive experiments on WebUAV-3M demonstrate that there is still substantial room for improvement in robust deep UAV tracking. The dataset, toolkits, and baseline results will be available on this page.

WebUAV-3M dataset

Dataset coming here soon...

Evaluation toolkits

Toolkits coming here soon...
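
Until the toolkits are released, the following is a minimal sketch of how bounding-box trackers are commonly scored: per-frame intersection over union (IoU) between predicted and annotated boxes, and the fraction of frames above an overlap threshold. It is an illustration only, not the WebUAV-3M toolkit. The `(x, y, w, h)` box format, the function names `iou` and `success_rate`, and the 0.5 threshold are all assumptions made for this example.

```python
from typing import Sequence

# Assumed box format: (x, y, w, h) = top-left corner, width, height.
Box = Sequence[float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the overlap region, clamped at zero when disjoint.
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def success_rate(preds: Sequence[Box], gts: Sequence[Box],
                 threshold: float = 0.5) -> float:
    """Fraction of frames whose IoU exceeds the threshold (hypothetical helper)."""
    if len(preds) != len(gts):
        raise ValueError("predictions and annotations must have equal length")
    if not preds:
        return 0.0
    hits = sum(iou(p, g) > threshold for p, g in zip(preds, gts))
    return hits / len(preds)
```

For example, `success_rate(tracker_boxes, annotated_boxes)` would take one predicted box and one annotated box per frame of a single video; both argument names are placeholders.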

Baseline results

Results coming here soon...

Environment

The experiments are implemented in PyTorch or MATLAB and run on an Ubuntu 18.04 server with an Intel(R) Xeon(R) Gold 6230R CPU @ 2.10GHz and three NVIDIA RTX A5000 GPUs.

Citation

If you find the dataset and toolkits useful in your research, please consider citing:

@article{WebUAV_3M_2022,
    title = {WebUAV-3M: A Benchmark Unveiling the Power of Million-Scale Deep UAV Tracking},
    author = {Chunhui Zhang and Guanjie Huang and Li Liu and Shan Huang and Yinan Yang and Yuxuan Zhang and Xiang Wan and Shiming Ge},
    journal = {arXiv preprint arXiv:2201.07425},
    year = {2022}
}

Acknowledgments

Thanks to the authors of the great [GOT-10k toolkit].
