Background-Click Supervision for Temporal Action Localization

This repository is the official implementation of BackTAL. In this work, we study temporal action localization under background-click supervision and find that the performance bottleneck of existing approaches mainly comes from background errors. We therefore convert the existing action-click supervision to background-click supervision and develop a novel method, called BackTAL. Extensive experiments on three benchmarks demonstrate the high performance of BackTAL and the rationality of the proposed background-click supervision.

Figure: architecture of the proposed BackTAL.

Requirements

To install requirements:

conda env create -f environment.yaml

Data Preparation

Download

Download the pre-extracted I3D features for the Thumos14, ActivityNet1.2 and HACS datasets from BaiduYun (extraction code: back).

Please ensure the data is organized as follows:
data
├── Thumos14
│   ├── val
│   │   ├── video_validation_0000051.npz
│   │   ├── video_validation_0000052.npz
│   │   └── ...
│   └── test
│       ├── video_test_0000004.npz
│       ├── video_test_0000006.npz
│       └── ...
├── ActivityNet1.2
│   ├── training
│   │   ├── v___dXUJsj3yo.npz
│   │   ├── v___wPHayoMgw.npz
│   │   └── ...
│   └── validation
│       ├── v__3I4nm2zF5Y.npz
│       ├── v__8KsVaJLOYI.npz
│       └── ...
└── HACS
    ├── training
    │   ├── v_0095rqic1n8.npz
    │   ├── v_62VWugDz1MY.npz
    │   └── ...
    └── validation
        ├── v_008gY2B8Pf4.npz
        ├── v_00BcXeG1gC0.npz
        └── ...
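
A quick way to confirm this layout before training or evaluation is to walk the expected directories and count the feature files. The sketch below is illustrative and not part of the repository: the helper names `check_feature_layout` and `list_arrays` are hypothetical, and the split names are simply copied from the tree above. Since the array names stored inside each `.npz` file are not documented here, `list_arrays` only lists them, relying on the fact that an `.npz` file is a zip archive with one `.npy` member per array.

```python
import zipfile
from pathlib import Path

# Split directories per dataset, copied from the tree above.
EXPECTED_SPLITS = {
    "Thumos14": ("val", "test"),
    "ActivityNet1.2": ("training", "validation"),
    "HACS": ("training", "validation"),
}


def check_feature_layout(root="./data"):
    """Map (dataset, split) to the number of .npz files, or None if the folder is missing."""
    counts = {}
    for dataset, splits in EXPECTED_SPLITS.items():
        for split in splits:
            folder = Path(root) / dataset / split
            if folder.is_dir():
                counts[(dataset, split)] = len(list(folder.glob("*.npz")))
            else:
                counts[(dataset, split)] = None
    return counts


def list_arrays(npz_path):
    """List the array names in one feature file without assuming what they are called."""
    with zipfile.ZipFile(npz_path) as archive:
        return [name[: -len(".npy")] for name in archive.namelist() if name.endswith(".npy")]
```

Calling `check_feature_layout("./data")` from the repository root returns a dictionary in which a `None` entry marks a missing split folder and a `0` marks a folder that holds no features.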
     

Background-Click Annotations

The raw background-click annotations for the THUMOS14 dataset are in the directory './data/THUMOS14/human_anns'.

Evaluation

Pre-trained Models

You can download checkpoints for the Thumos14, ActivityNet1.2 and HACS datasets from BaiduYun (extraction code: back). Each model was trained on the corresponding dataset using the configuration files in the directory "./experiments/". Please put the checkpoints in the directory "./checkpoints".

Evaluation

Before running the code, please activate the conda environment.

To evaluate the BackTAL model on Thumos14, run:

cd ./tools
python eval.py -dataset THUMOS14 -weight_file ../checkpoints/THUMOS14.pth

To evaluate the BackTAL model on ActivityNet1.2, run:

cd ./tools
python eval.py -dataset ActivityNet1.2 -weight_file ../checkpoints/ActivityNet1.2.pth

To evaluate the BackTAL model on HACS, run:

cd ./tools
python eval.py -dataset HACS -weight_file ../checkpoints/HACS.pth
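
To evaluate all three checkpoints in one go, the commands above can be wrapped in a small driver script. This is a convenience sketch, not something shipped with the repository: the function names `build_eval_command` and `evaluate_all` are hypothetical, the `-dataset` and `-weight_file` flags and the checkpoint file names are taken from the commands above, and `sys.executable` stands in for `python` so that the active conda interpreter is used.

```python
import subprocess
import sys

# Dataset names as passed to eval.py in the commands above.
DATASETS = ("THUMOS14", "ActivityNet1.2", "HACS")


def build_eval_command(dataset, checkpoint_dir="../checkpoints"):
    """Build the eval.py invocation for one dataset as an argument list."""
    weight_file = f"{checkpoint_dir}/{dataset}.pth"
    return [sys.executable, "eval.py", "-dataset", dataset, "-weight_file", weight_file]


def evaluate_all(tools_dir="./tools", dry_run=True):
    """Collect the three evaluation commands and, unless dry_run is set, run them."""
    commands = [build_eval_command(name) for name in DATASETS]
    if not dry_run:
        for command in commands:
            # cwd mirrors the `cd ./tools` step from the instructions above.
            subprocess.run(command, cwd=tools_dir, check=True)
    return commands
```

With the default `dry_run=True` the function only returns the commands, which is useful for checking paths; `evaluate_all(dry_run=False)` runs them in sequence from the repository root and stops at the first failure.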

Results

Our model achieves the following performance:

THUMOS14

tIoU threshold   0.3    0.4    0.5    0.6    0.7
mAP (%)          54.4   45.5   36.3   26.2   14.8

ActivityNet v1.2

tIoU threshold   average-mAP   0.50   0.75   0.95
mAP (%)          27.0          41.5   27.3   4.7

HACS

tIoU threshold   average-mAP   0.50   0.75   0.95
mAP (%)          20.0          31.5   19.5   4.7
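
The thresholds in these tables are temporal IoU (tIoU) values: a predicted segment counts as correct at a given threshold only if its overlap with a same-class ground-truth segment reaches that value. As a minimal illustration, and not the repository's evaluation code, the hypothetical helpers below compute the tIoU of two (start, end) segments and list the thresholds a prediction would satisfy.

```python
def temporal_iou(pred, gt):
    """Temporal IoU of two (start, end) segments given in the same time unit."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def thresholds_met(pred, gt, thresholds=(0.3, 0.4, 0.5, 0.6, 0.7)):
    """Return the thresholds at which `pred` overlaps `gt` enough to count as a match."""
    iou = temporal_iou(pred, gt)
    return [t for t in thresholds if iou >= t]
```

For example, a prediction of (10, 20) seconds against a ground truth of (12, 22) seconds shares 8 seconds out of a 12-second union, a tIoU of about 0.67, so it would count at thresholds 0.3 through 0.6 but not at 0.7.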

Training

To train the BackTAL model on the THUMOS14 dataset, run:

cd ./tools
python train.py -dataset THUMOS14

To train the BackTAL model on the ActivityNet v1.2 dataset, run:

cd ./tools
python train.py -dataset ActivityNet1.2

To train the BackTAL model on the HACS dataset, run:

cd ./tools
python train.py -dataset HACS

Citing BackTAL

@article{yang2021background,
  title={Background-Click Supervision for Temporal Action Localization},
  author={Yang, Le and Han, Junwei and Zhao, Tao and Lin, Tianwei and Zhang, Dingwen and Chen, Jianxin},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2021},
  publisher={IEEE}
}

Contact

For any discussions, please contact [email protected].
