Supervised Sliding Window Smoothing Loss Function Based on MS-TCN for Video Segmentation

Last update: Aug 03, 2022

Overview

SSWS-loss_function_based_on_MS-TCN

Abstract

As more and more videos are uploaded to the internet, video analysis has become one of the most important applications in many fields. Current video action segmentation methods fall into two groups: weakly supervised and supervised. The former typically use a sliding window or a Markov model, while the latter use the TCN model. In this paper, we introduce the Supervised Sliding Window Smoothing loss function (SSWS) into the TCN baseline as a complement to TMSE, the smoothing loss function of MS-TCN. The method selects three discriminant frames from the video prediction sequence and combines them into an adaptive sliding window that selectively smooths the whole prediction sequence. In particular, the penalty is doubled when the window slides to a position where the category is wrong. Compared with TMSE, our method effectively enlarges the receptive field of the smoothing loss function, and the proposed supervised loss penalizes only error frames. Experiments show that, compared with TMSE, SSWS yields significant improvements on three datasets: 50Salads, GTEA, and Breakfast.
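
For context, the TMSE baseline that SSWS complements can be sketched as below. This is a minimal illustration of a truncated mean squared error over the log-probabilities of adjacent frames, as commonly described for MS-TCN, and is not the SSWS loss itself, whose exact formulation is not given on this page. The function names, the pure-Python list layout, and the truncation threshold tau = 4 are assumptions made for the example.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one frame's class logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def tmse_loss(frame_logits, tau=4.0):
    """Truncated MSE between log-probabilities of adjacent frames.

    frame_logits: list of T frames, each a list of C class logits.
    tau: truncation threshold on |log y_t - log y_(t-1)| (assumed value).
    Returns the mean over all adjacent-frame, per-class terms.
    """
    log_probs = [log_softmax(frame) for frame in frame_logits]
    total, count = 0.0, 0
    for prev, curr in zip(log_probs, log_probs[1:]):
        for p, c in zip(prev, curr):
            # Clip the difference so real action boundaries are not over-penalized.
            delta = min(abs(c - p), tau)
            total += delta * delta
            count += 1
    return total / count if count else 0.0


# A stable prediction sequence incurs no smoothing penalty,
# while a sequence that flips class at every frame is penalized.
smooth = [[5.0, 0.0], [5.0, 0.0], [5.0, 0.0]]
flicker = [[10.0, 0.0], [0.0, 10.0], [10.0, 0.0]]
print(tmse_loss(smooth), tmse_loss(flicker))
```

Because each term compares only a frame with its immediate predecessor, this loss sees a window of two frames, which is the limited receptive field that the abstract contrasts with the three-frame adaptive window of SSWS.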
