Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding

Last update: Dec 31, 2022

2D-TAN (Optimized)

Introduction

This repository is an optimized re-implementation of the AAAI 2020 paper "Learning 2D Temporal Adjacent Networks for Moment Localization with Natural Language".

Compared with the official implementation (https://github.com/microsoft/2D-TAN), it trains and runs inference faster, uses less GPU memory, and reaches higher accuracy.

Comparison

Performance: Better Results

1. TACoS Dataset

Repo	Rank1@0.1	Rank1@0.3	Rank1@0.5	Rank5@0.1	Rank5@0.3	Rank5@0.5
Official	47.59	37.29	25.32	70.31	57.81	45.04
Ours	57.54	45.36	31.87	77.88	65.83	54.29

2. ActivityNet Dataset

Repo	Rank1@0.3	Rank1@0.5	Rank1@0.7	Rank5@0.3	Rank5@0.5	Rank5@0.7
Official	59.45	44.51	26.54	85.53	77.13	61.96
Ours	60.00	45.25	28.62	85.80	77.25	62.11

Speed and Cost: Faster Training/Inference, Less Memory Cost

1. Speed (ActivityNet Dataset)

Repo	Training	Inference	Required Training Epochs
Official	1.98 s/batch	0.81 s/batch	100
Ours	1.50 s/batch	0.61 s/batch	5
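The per-batch timings and epoch counts in the table above can be combined into rough speedup factors. The following is a minimal sketch of that arithmetic; the end-to-end figure assumes both repositories process the same number of batches per epoch, which the table does not state explicitly.

```shell
# Ratio of two numbers, printed with two decimals (awk does the floating-point math).
ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

train_speedup=$(ratio 1.98 1.50)   # training seconds per batch, official / ours
infer_speedup=$(ratio 0.81 0.61)   # inference seconds per batch, official / ours

# Total training cost scales with (seconds per batch) x (epochs), assuming
# the number of batches per epoch is identical for both repositories.
total_speedup=$(awk 'BEGIN { printf "%.1f\n", (1.98 * 100) / (1.50 * 5) }')

echo "training per batch:  ${train_speedup}x"
echo "inference per batch: ${infer_speedup}x"
echo "training end to end: ${total_speedup}x"
```

Most of the end-to-end gain comes from needing 5 epochs instead of 100, not from the per-batch time.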

2. Memory Cost (ActivityNet Dataset)

Repo	Training	Inference
Official	4*10145 MB/batch	4*3065 MB/batch
Ours	4*5345 MB/batch	4*2121 MB/batch

Note: These results were measured on 4 NVIDIA Tesla V100 GPUs with a batch size of 32.

Installation

Installation is straightforward. Please refer to INSTALL.md.

Dataset

Please refer to DATASET.md to prepare datasets.

Quick Start

We provide scripts that simplify training and inference: scripts/train.sh and scripts/eval.sh.

For example, to train on the TACoS dataset, modify scripts/train.sh as follows:

# find all configs in configs/
model=2dtan_128x128_pool_k5l8_tacos
# set your gpu id
gpus=0,1,2,3
# number of gpus
gpun=4
# use a different address and port (e.g., 127.0.0.2, 29502) when running multiple 2D-TAN jobs on the same machine
master_addr=127.0.0.1
master_port=29501
...
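The elided part of the script presumably passes these variables to a distributed launcher. As a hedged sketch only: the snippet below assembles a `torch.distributed.launch` command from them and prints it without running it. The entry point `train_net.py` and its `--config-file` flag are hypothetical names used for illustration; check scripts/train.sh for the real ones.

```shell
model=2dtan_128x128_pool_k5l8_tacos
gpus=0,1,2,3
gpun=4
master_addr=127.0.0.1
master_port=29501

# Hypothetical entry point; the real script name may differ.
entry=train_net.py

# Print the launch command instead of executing it (dry run).
# --nproc_per_node, --master_addr and --master_port are launcher options;
# --config-file is an assumed option of the hypothetical entry point.
build_train_cmd() {
  echo "CUDA_VISIBLE_DEVICES=$gpus python -m torch.distributed.launch" \
       "--nproc_per_node=$gpun --master_addr $master_addr --master_port $master_port" \
       "$entry --config-file configs/$model.yaml"
}

build_train_cmd
```

Printing the command first makes it easy to confirm that the GPU list, process count, and port are the intended ones before starting a long run.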

As another example, to evaluate on the ActivityNet dataset, modify scripts/eval.sh as follows:

# find all configs in configs/
config_file=configs/2dtan_64x64_pool_k9l4_activitynet.yaml
# directory containing the saved weights
weight_dir=outputs/2dtan_64x64_pool_k9l4_activitynet
# weight file to evaluate
weight_file=model_1e.pth
# test batch size
batch_size=32
# set your gpu id
gpus=0,1,2,3
# number of gpus
gpun=4
# use a different address and port (e.g., 127.0.0.2, 29502) when running multiple 2D-TAN jobs on the same machine
master_addr=127.0.0.2
master_port=29502
...
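Two settings in these scripts are easy to get out of sync: `gpun` must equal the number of ids in `gpus`, and each concurrent job needs its own `master_port`. A minimal sketch of deriving both is shown here; `count_gpus` and `port_for_job` are hypothetical helpers, not part of the repository's scripts.

```shell
# Count the comma-separated ids in a GPU list such as "0,1,2,3".
count_gpus() {
  echo "$1" | awk -F',' '{ print NF }'
}

# Hypothetical helper: offset a base port by a job index so that
# concurrent jobs on one machine do not share a master_port.
port_for_job() {
  echo $(( $1 + $2 ))
}

gpus=0,1,2,3
gpun=$(count_gpus "$gpus")
master_port=$(port_for_job 29501 1)   # second job: base port + 1

echo "gpun=$gpun master_port=$master_port"
```

With the values above this yields `gpun=4` and `master_port=29502`, matching the evaluation example.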

Support

If you run into a problem, please open a new issue and we will be glad to answer it. You can also contact me directly at [email protected] if you need help.

Acknowledgements

We greatly appreciate the official 2D-TAN repository (https://github.com/microsoft/2D-TAN) and maskrcnn-benchmark (https://github.com/facebookresearch/maskrcnn-benchmark); we learned a lot from both. Please also remember to cite the paper:

@InProceedings{2DTAN_2020_AAAI,
author = {Zhang, Songyang and Peng, Houwen and Fu, Jianlong and Luo, Jiebo},
title = {Learning 2D Temporal Adjacent Networks for Moment Localization with Natural Language},
booktitle = {AAAI},
year = {2020}
}

Owner

Joya Chen
