Blind visual quality assessment on 360° Video based on progressive learning


Blind visual quality assessment on omnidirectional or 360 video (ProVQA)

Blind VQA for 360° Video via Progressively Learning from Pixels, Frames and Video

This repository contains the official PyTorch implementation of the following paper:

Blind VQA for 360° Video via Progressively Learning from Pixels, Frames and Video
Li Yang, Mai Xu, ShengXi Li, YiChen Guo and Zulin Wang (School of Electronic and Information Engineering, Beihang University)
Paper link: https://arxiv.org/abs/2111.09503
Abstract: Blind visual quality assessment (BVQA) on 360° video plays a key role in optimizing immersive multimedia systems. When assessing the quality of 360° video, humans tend to perceive its quality degradation from the viewport-based spatial distortion of each spherical frame to motion artifacts across adjacent frames, ending with the video-level quality score, i.e., a progressive quality assessment paradigm. However, the existing BVQA approaches for 360° video neglect this paradigm. In this paper, we take into account the progressive paradigm of human perception towards spherical video quality, and thus propose a novel BVQA approach (namely ProVQA) for 360° video via progressively learning from pixels, frames and video. Corresponding to the progressive learning of pixels, frames and video, three sub-nets are designed in our ProVQA approach, i.e., the spherical perception aware quality prediction (SPAQ), motion perception aware quality prediction (MPAQ) and multi-frame temporal non-local (MFTN) sub-nets. The SPAQ sub-net first models the spatial quality degradation based on the spherical perception mechanism of humans. Then, by exploiting motion cues across adjacent frames, the MPAQ sub-net properly incorporates motion contextual information for quality assessment on 360° video. Finally, the MFTN sub-net aggregates multi-frame quality degradation to yield the final quality score, via exploring long-term quality correlation from multiple frames. The experiments validate that our approach significantly advances the state-of-the-art BVQA performance on 360° video over two datasets, the code of which is publicly available at https://github.com/yanglixiaoshen/ProVQA.
Note: Since this paper is under review, you may request a copy from me to ease the implementation of this project, but you have no right to use the paper for any other purpose. Unauthorized use of the paper will be investigated for legal responsibility. To request access, contact me by email (see the address at the end of this document).

Preparation

Requirements

First, download the conda environment file of ProVQA from ProVQA_dependency. Then, on a Linux system (Ubuntu 18.04+), create the conda environment <envs> by running the following command:

conda env create -f ProVQA_environment.yaml

Second, install all dependencies by running the following command:

pip install -r ProVQA_environment.txt

If the above installation doesn't work, you can download the environment archive in .tar.gz format. Then, extract it into a directory (e.g., pro_env) in your home directory and activate the environment every time before you run the code:

source activate /home/xxx/pro_env

Implementation

The architecture of the proposed ProVQA is shown in the following figure, which contains four novel modules, i.e., SPAQ, MPAQ, MFTN and AQR.

Dataset

We trained ProVQA on the large-scale 360° VQA dataset VQA-ODV, which includes 540 impaired 360° videos derived from 60 reference 360° videos under equirectangular projection (ERP); the training set contains 432 videos and the testing set 108. We also evaluate ProVQA on 144 distorted 360° videos from the BIT360 dataset.

Training the ProVQA

Our network is implemented in PyTorch and runs on two NVIDIA Tesla V100 GPUs with 32 GB of memory. The number of sampled frames is 6 and the batch size is 3 per GPU for each iteration. The training set of the VQA-ODV dataset has been packed as an LMDB file, ODV-VQA_Train, which is used in our approach.

First, run the training code as follows:
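To make the numbers above concrete, here is a minimal sketch of the per-iteration workload together with one possible way of picking 6 frames from a video. The helper `sample_frame_indices` and its evenly spaced sampling rule are assumptions for illustration only; the actual sampling strategy is defined by the repository's data loader and YAML settings.

```python
# Settings stated in the text above.
NUM_GPUS = 2
BATCH_PER_GPU = 3
FRAMES_PER_VIDEO = 6


def sample_frame_indices(num_frames, num_samples=FRAMES_PER_VIDEO):
    """Return num_samples evenly spaced frame indices in [0, num_frames).

    Hypothetical helper: evenly spaced sampling is an assumption,
    not necessarily what the ProVQA data loader does.
    """
    if num_frames < num_samples:
        raise ValueError("video has fewer frames than requested samples")
    step = num_frames / num_samples
    # Take the centre of each of the num_samples equal-length segments.
    return [int(step * i + step / 2) for i in range(num_samples)]


# Total number of frames processed per training iteration across both GPUs.
frames_per_iteration = NUM_GPUS * BATCH_PER_GPU * FRAMES_PER_VIDEO

if __name__ == "__main__":
    print(sample_frame_indices(300))
    print(frames_per_iteration)
```

With these settings each iteration covers 6 videos (3 per GPU), which is why the 32 GB cards are helpful for high-resolution ERP frames.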

CUDA_VISIBLE_DEVICES=0,1 python ./train.py -opt ./options/train/bvqa360_240hz.yaml

Note that all the settings of the dataset, training implementation and network can be found in "bvqa360_240hz.yaml". You can modify the settings to suit your experimental environment; for example, the dataset path should be changed to your server path. For the final BVQA result, we choose the trained model at iter=26400, which can be downloaded at saved_model. Moreover, the corresponding training state can be obtained at saved_optimized_state.

Testing the ProVQA

Download the saved_model and put it in your own experimental directory. Then, run the following code for evaluating the BVQA performance over the testing set ODV-VQA_TEST. Note that all the settings of testing set, testing implementation and results can be found in "test_bvqa360_OURs.yaml". You can modify the settings to satisfy your experimental environment.

CUDA_VISIBLE_DEVICES=0 python ./test.py -opt ./options/test/test_bvqa360_OURs.yaml

The predicted quality scores of all test 360° video frames can be found in All_frame_scores. Afterwards, run the following code to generate the final 108 scores corresponding to the 108 test 360° videos; these scores can also be downloaded from predicted_DMOS.

python ./evaluate.py
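As a rough picture of what this step does conceptually, the sketch below collapses per-frame scores into one score per video by averaging. Both the `(video_id, score)` input layout and the use of a plain mean are assumptions for illustration; consult evaluate.py for the actual file format and aggregation rule.

```python
from collections import defaultdict


def aggregate_video_scores(frame_scores):
    """Average per-frame scores into one score per video.

    frame_scores: iterable of (video_id, score) pairs. This layout is
    hypothetical and may differ from the All_frame_scores file.
    """
    buckets = defaultdict(list)
    for video_id, score in frame_scores:
        buckets[video_id].append(float(score))
    # One mean score per video id.
    return {vid: sum(scores) / len(scores) for vid, scores in buckets.items()}


if __name__ == "__main__":
    demo = [("video_a", 40.0), ("video_a", 44.0), ("video_b", 61.5)]
    print(aggregate_video_scores(demo))
```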

Evaluate BVQA performance

We evaluate the BVQA performance for 360° videos with five common metrics: PLCC, SROCC, KROCC, RMSE and MAE. We employ a 4-order logistic function to fit the predicted quality scores to their corresponding ground truth, such that the fitted scores have the same scale as the ground-truth DMOS gt_dmos. Note that the fitting procedure is applied to our approach and to all compared approaches. Run the code bvqa360_metric with the following command:

./bvqa360_metric1.m

This yields the final results of PLCC=0.9209, SROCC=0.9236, KROCC=0.7760, RMSE=4.6165 and MAE=3.1136. The following tables show the comparison of BVQA performance between our approach and 13 other approaches on the VQA-ODV and BIT360 datasets.

Tips

(1) We have summarized in detail how to run the compared algorithms in the file "compareAlgoPreparation.txt".
(2) The details of the pre-processing of the ODV-VQA and BIT360 datasets can be found in the file "pre_process_dataset.py".

Citation

If this repository can offer you help in your research, please cite the paper:

@misc{yang2021blind,
      title={Blind VQA on 360{\deg} Video via Progressively Learning from Pixels, Frames and Video}, 
      author={Li Yang and Mai Xu and Shengxi Li and Yichen Guo and Zulin Wang},
      year={2021},
      eprint={2111.09503},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement

  1. https://github.com/xinntao/EDVR
  2. https://github.com/AlexHex7/Non-local_pytorch
  3. https://github.com/ChiWeiHsiao/SphereNet-pytorch

Please enjoy it and best wishes. Please contact me if you have any questions about the ProVQA approach.

My email address is 13021041[at]buaa[dot]edu[dot]cn
