Revisiting Video Saliency: A Large-scale Benchmark and a New Model (CVPR18, PAMI19)

Overview

DHF1K

===========================================================================

Wenguan Wang, J. Shen, M.-M Cheng and A. Borji,

Revisiting Video Saliency: A Large-scale Benchmark and a New Model,

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018 and

IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2019

===========================================================================

The code (ACLNet) and dataset (DHF1K with raw gaze records, UCF-sports are new added!) can be downloaded from:

Google disk:https://drive.google.com/open?id=1sW0tf9RQMO4RR7SyKhU8Kmbm4jwkFGpQ

Baidu pan: https://pan.baidu.com/s/110NIlwRIiEOTyqRwYdDnVg

The Hollywood-2 (74.6G, including attention maps) can be downloaded from:

Google disk:https://drive.google.com/file/d/1vfRKJloNSIczYEOVjB4zMK8r0k4VJuWk/view?usp=sharing

Baidu pan: link:https://pan.baidu.com/s/16BIAuaGEDDbbjylJ8zziuA code:bt3x

Since so many people are interested in the training code, I decide to upload it in above webdisks. Enjoy it.

===========================================================================

Files:

'video': 1000 videos (videoname.AVI)

'annotation/videoname/maps': continuous saliency maps in '.png' format

'annotation/videoname/fixation': binary eye fixation maps in '.png' format

'annotation/videoname/maps': binary eye fixation maps stored in mat file

'generate_frame.m': used for extracting the frame images from AVI videos.

Please note raw data of individual viewers are stored in 'exportdata_train.rar'.

Note that please do not change the way of naming frames.

===========================================================================

Dataset splitting:

Training set: first 600 videos (001.AVI-600.AVI)

Validation set: 100 videos (601.AVI-700.AVI)

Testing set: 300 videos (701.AVI-1000.AVI)

The annotations for the training and val sets are released, but the

annotations of the testing set are held-out for benchmarking.

===========================================================================

We have corrected some statistics of our results (baseline training setting (iii)) on UCF sports dataset. Please see our newest version in ArXiv.

===========================================================================

Note that, for Holly-wood2 dataset, we used the split videos (each video only contains one shot), instead of the full videos.

===========================================================================

The raw data of gaze record "exportdata_train.rar" has been uploaded.

===========================================================================

For DHF1K dataset, we use following functions to generate continous saliency map:

[x,y]=find(fixations);

densityMap= make_gauss_masks(y,x,[video_res_y,video_res_x]);

make_gauss_masks.m has been uploaded.

For UCF and Hollywood, I directly use following functions:

densityMap = imfilter(fixations,fspecial('gaussian',150,20),'replicate');

===========================================================================

Results submission.

Please orgnize your results in following format:

yourmethod/videoname/framename.png

Note that the frames and framenames should be generated by 'generate_frame.m'.

Then send your results to '[email protected]'.

You can only sumbmit ONCE within One week.

Please first test your model on the val set or other video saliency dataset.

The response may be more than one week.

If you want to list your results on our web, please send your name, model

name, paper title, short description of your method and the link of the web

of your project (if you have).

===========================================================================

We use

Keras: 2.2.2

tensorflow: 1.10.0

to implement our model.

===========================================================================

Citation:

@InProceedings{Wang_2018_CVPR,
author = {Wang, Wenguan and Shen, Jianbing and Guo, Fang and Cheng, Ming-Ming and Borji, Ali},
title = {Revisiting Video Saliency: A Large-Scale Benchmark and a New Model},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition},
year = {2018}
}

@ARTICLE{Wang_2019_revisitingVS, 
author={W. {Wang} and J. {Shen} and J. {Xie} and M. {Cheng} and H. {Ling} and A. {Borji}}, 
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, 
title={Revisiting Video Saliency Prediction in the Deep Learning Era}, 
year={2019}, 
}

If you find our dataset is useful, please cite above papers.

===========================================================================

Code (ACLNet):

You can find the code in google disk: https://drive.google.com/open?id=1sW0tf9RQMO4RR7SyKhU8Kmbm4jwkFGpQ

===========================================================================

Terms of use:

The dataset and code are licensed under a Creative Commons Attribution 4.0 License.

===========================================================================

Contact Information Email: [email protected]


Owner
Wenguan Wang
Postdoctoral Scholar
Wenguan Wang
Imaging, analysis, and simulation software for radio interferometry

ehtim (eht-imaging) Python modules for simulating and manipulating VLBI data and producing images with regularized maximum likelihood methods. This ve

Andrew Chael 5.2k Dec 28, 2022
The final project for "Applying AI to Wearable Device Data" course from "AI for Healthcare" - Udacity.

Motion Compensated Pulse Rate Estimation Overview This project has 2 main parts. Develop a Pulse Rate Algorithm on the given training data. Then Test

Omar Laham 2 Oct 25, 2022
Official implementation of the paper Label-Efficient Semantic Segmentation with Diffusion Models

Label-Efficient Semantic Segmentation with Diffusion Models Official implementation of the paper Label-Efficient Semantic Segmentation with Diffusion

Yandex Research 355 Jan 06, 2023
AIR^2 for Interaction Prediction

This is the repository for AIR^2 for Interaction Prediction. Explanation of the solution: Video: link License AIR is released under the Apache 2.0 lic

21 Sep 27, 2022
Forecasting for knowable future events using Bayesian informative priors (forecasting with judgmental-adjustment).

What is judgyprophet? judgyprophet is a Bayesian forecasting algorithm based on Prophet, that enables forecasting while using information known by the

AstraZeneca 56 Oct 26, 2022
MutualGuide is a compact object detector specially designed for embedded devices

Introduction MutualGuide is a compact object detector specially designed for embedded devices. Comparing to existing detectors, this repo contains two

ZHANG Heng 103 Dec 13, 2022
Python package for downloading ECMWF reanalysis data and converting it into a time series format.

ecmwf_models Readers and converters for data from the ECMWF reanalysis models. Written in Python. Works great in combination with pytesmo. Citation If

TU Wien - Department of Geodesy and Geoinformation 31 Dec 26, 2022
The implementation our EMNLP 2021 paper "Enhanced Language Representation with Label Knowledge for Span Extraction".

LEAR The implementation our EMNLP 2021 paper "Enhanced Language Representation with Label Knowledge for Span Extraction". **The code is in the "master

杨攀 93 Jan 07, 2023
CLIP (Contrastive Language–Image Pre-training) for Italian

Italian CLIP CLIP (Radford et al., 2021) is a multimodal model that can learn to represent images and text jointly in the same space. In this project,

Italian CLIP 114 Dec 29, 2022
Implementation of "Selection via Proxy: Efficient Data Selection for Deep Learning" from ICLR 2020.

Selection via Proxy: Efficient Data Selection for Deep Learning This repository contains a refactored implementation of "Selection via Proxy: Efficien

Stanford Future Data Systems 70 Nov 16, 2022
Drone detection using YOLOv5

This drone detection system uses YOLOv5 which is a family of object detection architectures and we have trained the model on Drone Dataset. Overview I

Tushar Sarkar 27 Dec 20, 2022
Source code for the plant extraction workflow introduced in the paper “Agricultural Plant Cataloging and Establishment of a Data Framework from UAV-based Crop Images by Computer Vision”

Plant extraction workflow Source code for the plant extraction workflow introduced in the paper "Agricultural Plant Cataloging and Establishment of a

Maurice Günder 0 Apr 22, 2022
MatryODShka: Real-time 6DoF Video View Synthesis using Multi-Sphere Images

Main repo for ECCV 2020 paper MatryODShka: Real-time 6DoF Video View Synthesis using Multi-Sphere Images. visual.cs.brown.edu/matryodshka

Brown University Visual Computing Group 75 Dec 13, 2022
An example of Scatterbrain implementation (combining local attention and Performer)

An example of Scatterbrain implementation (combining local attention and Performer)

HazyResearch 97 Jan 02, 2023
Kaggle: Cell Instance Segmentation

Kaggle: Cell Instance Segmentation The goal of this challenge is to detect cells in microscope images. with simple view on how many cels have been ann

Jirka Borovec 9 Aug 12, 2022
This is an official pytorch implementation of Lite-HRNet: A Lightweight High-Resolution Network.

Lite-HRNet: A Lightweight High-Resolution Network Introduction This is an official pytorch implementation of Lite-HRNet: A Lightweight High-Resolution

HRNet 675 Dec 25, 2022
I decide to sync up this repo and self-critical.pytorch. (The old master is in old master branch for archive)

An Image Captioning codebase This is a codebase for image captioning research. It supports: Self critical training from Self-critical Sequence Trainin

Ruotian(RT) Luo 1.3k Dec 31, 2022
Change Detection in SAR Images Based on Multiscale Capsule Network

SAR_CD_MS_CapsNet Code for the paper "Change Detection in SAR Images Based on Multiscale Capsule Network" , IEEE Geoscience and Remote Sensing Letters

Feng Gao 21 Nov 29, 2022
Computer Vision is an elective course of MSAI, SCSE, NTU, Singapore

[AI6122] Computer Vision is an elective course of MSAI, SCSE, NTU, Singapore. The repository corresponds to the AI6122 of Semester 1, AY2021-2022, starting from 08/2021. The instructor of this course

HT. Li 5 Sep 12, 2022
Newt - a Gaussian process library in JAX.

Newt __ \/_ (' \`\ _\, \ \\/ /`\/\ \\ \ \\

AaltoML 0 Nov 02, 2021