A PyTorch implementation of the CVPR 2021 paper "VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild"

Last update: Nov 29, 2022

Overview

VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild

Preparation

Download VSPW dataset

The VSPW dataset with extracted frames and masks is available here. The VSPW_480P version of the dataset can now be downloaded directly.

Dependencies

Python 3.7
PyTorch 1.3.1
NumPy

Download the ImageNet-pretrained models at this link. Put the downloaded archive in the repository root folder and decompress it there.

Train and Test

Resize the frames and masks of the VSPW dataset to 480p.

python change2_480p.py
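
The resizing step can be sketched as follows. This is a minimal illustration, not the contents of change2_480p.py: it assumes that "480p" means scaling each image so its shorter side becomes 480 pixels with the aspect ratio preserved, and the helper names `target_size` and `resize_mask_nearest` are hypothetical. The point it shows is that label masks are best resampled with nearest-neighbor interpolation, so that no new class IDs are created by blending neighboring labels.

```python
def target_size(width, height, short_side=480):
    """Return (new_width, new_height) with the shorter side scaled to short_side.

    Assumption: '480p' is taken to mean a shorter side of 480 pixels.
    """
    scale = short_side / min(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))


def resize_mask_nearest(mask, new_width, new_height):
    """Nearest-neighbor resize of a label mask given as a list of rows.

    Every output value is copied from an input pixel, so the set of
    class IDs can only shrink, never gain blended values.
    """
    old_height, old_width = len(mask), len(mask[0])
    resized = []
    for y in range(new_height):
        src_y = min(old_height - 1, y * old_height // new_height)
        row = [
            mask[src_y][min(old_width - 1, x * old_width // new_width)]
            for x in range(new_width)
        ]
        resized.append(row)
    return resized


if __name__ == "__main__":
    print(target_size(1920, 1080))
    print(resize_mask_nearest([[1, 2], [3, 4]], 4, 4))
```

For real image files, a library such as Pillow could do the same job with `Image.resize`, using `Image.NEAREST` for masks and `Image.BILINEAR` for frames.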

Edit the .sh files in scripts/ and set $DATAROOT to the path of your VSPW_480p directory.
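
Before launching training, it can help to confirm that the path assigned to $DATAROOT looks right. The sketch below assumes a layout in which each video has its own folder under `data/`, with an `origin/` subfolder for frames and a `mask/` subfolder for annotations. That layout, the folder names, and the function names `summarize_dataroot` and `mismatched_videos` are assumptions made for illustration; adjust them to match the downloaded archive.

```python
import os


def summarize_dataroot(dataroot, frames_dir="origin", masks_dir="mask"):
    """Return {video_name: (num_frames, num_masks)}.

    Assumed layout (not confirmed by the repository):
        dataroot/data/<video>/<frames_dir>/
        dataroot/data/<video>/<masks_dir>/
    """
    data_dir = os.path.join(dataroot, "data")
    if not os.path.isdir(data_dir):
        raise FileNotFoundError("expected a 'data' folder under " + dataroot)
    summary = {}
    for video in sorted(os.listdir(data_dir)):
        video_dir = os.path.join(data_dir, video)
        if not os.path.isdir(video_dir):
            continue
        counts = []
        for sub in (frames_dir, masks_dir):
            sub_dir = os.path.join(video_dir, sub)
            # A missing subfolder is reported as zero files.
            counts.append(len(os.listdir(sub_dir)) if os.path.isdir(sub_dir) else 0)
        summary[video] = tuple(counts)
    return summary


def mismatched_videos(summary):
    """List videos whose frame and mask counts differ."""
    return [name for name, (frames, masks) in summary.items() if frames != masks]
```

Calling `mismatched_videos(summarize_dataroot("/path/to/VSPW_480p"))` with your own path would list videos worth inspecting; a non-empty result may point to an incomplete download or to a layout that differs from the one assumed here.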

Image-based methods

PSPNet

sh scripts/run_psp.sh

OCRNet

sh scripts/run_ocr.sh

Video-based methods

TCB-PSP

sh run_temporal_psp.sh

TCB-OCR

sh run_temporal_ocr.sh

Evaluation on TC and VC

Set the dataset root and the prediction root in TC_cal.py and VC_perclip.py to your own paths.

python TC_cal.py

python VC_perclip.py
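
To make the VC (video consistency) score more concrete, the sketch below computes it for a single clip under one reading of the metric: among pixels whose ground-truth label stays the same across all frames of the clip, it returns the fraction whose prediction matches the ground truth in every frame. This is an illustration only and is not the code in VC_perclip.py. The function name `video_consistency`, the flat-list input format, and the ignore label 255 are assumptions. TC, which relies on optical flow, is not covered here.

```python
def video_consistency(gt_frames, pred_frames, ignore_label=255):
    """VC for one clip, under the assumptions stated above.

    gt_frames, pred_frames: lists of frames, each frame a flat list of labels.
    ignore_label: hypothetical 'void' label excluded from the score.
    Returns None when no pixel has a stable ground-truth label.
    """
    if len(gt_frames) != len(pred_frames) or not gt_frames:
        raise ValueError("need the same, non-zero number of GT and predicted frames")
    num_pixels = len(gt_frames[0])
    stable = 0      # pixels whose GT label is identical in every frame
    consistent = 0  # stable pixels that are predicted correctly in every frame
    for p in range(num_pixels):
        label = gt_frames[0][p]
        if label == ignore_label:
            continue
        if any(frame[p] != label for frame in gt_frames):
            continue
        stable += 1
        if all(frame[p] == label for frame in pred_frames):
            consistent += 1
    return consistent / stable if stable else None


if __name__ == "__main__":
    gt = [[1, 1, 2, 3], [1, 1, 2, 4]]
    pred = [[1, 0, 2, 3], [1, 1, 2, 4]]
    print(video_consistency(gt, pred))
```

In the small example, three pixels have a stable ground-truth label and two of them are predicted correctly in both frames, so the score is 2/3. A dataset-level number would then come from averaging such per-clip values, assuming that is how the clips are aggregated.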

This implementation builds on this code and RAFT.

Citation

@inproceedings{miao2021vspw,
  title={VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild},
  author={Miao, Jiaxu and Wei, Yunchao and Wu, Yu and Liang, Chen and Li, Guangrui and Yang, Yi},
  booktitle={Proceedings of the {IEEE} Conference on Computer Vision and Pattern Recognition},
  year={2021}
}
