Official pytorch implementation of the AAAI 2021 paper Semantic Grouping Network for Video Captioning

Related tags

Deep LearningSGN
Overview

Semantic Grouping Network for Video Captioning

Hobin Ryu, Sunghun Kang, Haeyong Kang, and Chang D. Yoo. AAAI 2021. [arxiv]

Environment

  • Ubuntu 16.04
  • CUDA 9.2
  • cuDNN 7.4.2
  • Java 8
  • Python 2.7.12
    • PyTorch 1.1.0
    • Other python packages specified in requirements.txt

Usage

1. Setup

$ pip install -r requirements.txt

2. Prepare Data

  1. Download the GloVe Embedding from here and locate it at data/Embeddings/GloVe/GloVe_300.json.

  2. Extract features from datasets and locate them at data/ /features/ .hdf5 .

    e.g. ResNet101 features of the MSVD dataset will be located at data/MSVD/features/ResNet101.hdf5.

    I refer to this repo for extracting the ResNet101 features, and this repo for extracting the 3D-ResNext101 features.

  3. Split the features into train, val, and test sets by running following commands.

    $ python -m split.MSVD
    $ python -m split.MSR-VTT
    

You can skip step 2-3 and download below files

3. Prepare The Code for Evaluation

Clone the evaluation code from the official coco-evaluation repo.

$ git clone https://github.com/tylin/coco-caption.git
$ mv coco-caption/pycocoevalcap .
$ rm -rf coco-caption

4. Extract Negative Videos

$ python extract_negative_videos.py

or you can skip this step as the output files are already uploaded at data/ /metadata/neg_vids_ .json

5. Train

$ python train.py

You can change some hyperparameters by modifying config.py.

Pretrained Models - SGN(R101+RN)

*Disclaimer: The models above do not have the same weight as the models used in the paper (I trained them again because I lost).

6. Evaluate

$ python evaluate.py --ckpt_fpath 
   

   

License

The source-code in this repository is released under MIT License.

Owner
Hobin Ryu
Hobin Ryu
Neural Contours: Learning to Draw Lines from 3D Shapes (CVPR2020)

Neural Contours: Learning to Draw Lines from 3D Shapes This repository contains the PyTorch implementation for CVPR 2020 Paper "Neural Contours: Learn

93 Dec 16, 2022
A collection of Reinforcement Learning algorithms from Sutton and Barto's book and other research papers implemented in Python.

Reinforcement-Learning-Notebooks A collection of Reinforcement Learning algorithms from Sutton and Barto's book and other research papers implemented

Pulkit Khandelwal 1k Dec 28, 2022
A deep learning based semantic search platform that computes similarity scores between provided query and documents

semanticsearch This is a deep learning based semantic search platform that computes similarity scores between provided query and documents. Documents

1 Nov 30, 2021
Codecov coverage standard for Python

Python-Standard Last Updated: 01/07/22 00:09:25 What is this? This is a Python application, with basic unit tests, for which coverage is uploaded to C

Codecov 10 Nov 04, 2022
Source code for "OmniPhotos: Casual 360° VR Photography"

OmniPhotos: Casual 360° VR Photography Project Page | Video | Paper | Demo | Data This repository contains the source code for creating and viewing Om

Christian Richardt 144 Dec 30, 2022
Project page for the paper Semi-Supervised Raw-to-Raw Mapping 2021.

Project page for the paper Semi-Supervised Raw-to-Raw Mapping 2021.

Mahmoud Afifi 22 Nov 08, 2022
My implementation of Fully Convolutional Neural Networks in Keras

Keras-FCN This repository contains my implementation of Fully Convolutional Networks in Keras (Tensorflow backend). Currently, semantic segmentation c

The Duy Nguyen 15 Jan 13, 2020
Direct Multi-view Multi-person 3D Human Pose Estimation

Implementation of NeurIPS-2021 paper: Direct Multi-view Multi-person 3D Human Pose Estimation [paper] [video-YouTube, video-Bilibili] [slides] This is

Sea AI Lab 251 Dec 30, 2022
Efficient semidefinite bounds for multi-label discrete graphical models.

Low rank solvers #################################### benchmark/ : folder with the random instances used in the paper. ############################

1 Dec 08, 2022
A pytorch reproduction of { Co-occurrence Feature Learning from Skeleton Data for Action Recognition and Detection with Hierarchical Aggregation }.

A PyTorch Reproduction of HCN Co-occurrence Feature Learning from Skeleton Data for Action Recognition and Detection with Hierarchical Aggregation. Ch

Guyue Hu 210 Dec 31, 2022
Using Machine Learning to Create High-Res Fine Art

BIG.art: Using Machine Learning to Create High-Res Fine Art How to use GLIDE and BSRGAN to create ultra-high-resolution paintings with fine details By

Robert A. Gonsalves 13 Nov 27, 2022
Simple improvement of VQVAE that allow to generate x2 sized images compared to baseline

vqvae_dwt_distiller.pytorch Simple improvement of VQVAE that allow to generate x2 sized images compared to baseline. It allows to generate 512x512 ima

Sergei Belousov 25 Jul 19, 2022
Black-Box-Tuning - Black-Box Tuning for Language-Model-as-a-Service

Black-Box-Tuning Source code for paper "Black-Box Tuning for Language-Model-as-a

Tianxiang Sun 149 Jan 04, 2023
How to Leverage Multimodal EHR Data for Better Medical Predictions?

How to Leverage Multimodal EHR Data for Better Medical Predictions? This repository contains the code of the paper: How to Leverage Multimodal EHR Dat

13 Dec 13, 2022
style mixing for animation face

An implementation of StyleGAN on Animation dataset. Install git clone https://github.com/MorvanZhou/anime-StyleGAN cd anime-StyleGAN pip install -r re

Morvan 46 Nov 30, 2022
Spam your friends and famly and when you do your famly will disown you and you will have no friends.

SpamBot9000 Spam your friends and family and when you do your family will disown you and you will have no friends. Terms of Use Disclaimer: Please onl

DJ15 0 Jun 09, 2022
Numerical Methods with Python, Numpy and Matplotlib

Numerical Bric-a-Brac Collections of numerical techniques with Python and standard computational packages (Numpy, SciPy, Numba, Matplotlib ...). Diffe

Vincent Bonnet 10 Dec 20, 2021
This repository contains the code for designing risk bounded motion plans for car-like robot using Carla Simulator.

Nonlinear Risk Bounded Robot Motion Planning This code simulates the bicycle dynamics of car by steering it on the road by avoiding another static car

8 Sep 03, 2022
SelfAugment extends MoCo to include automatic unsupervised augmentation selection.

SelfAugment extends MoCo to include automatic unsupervised augmentation selection. In addition, we've included the ability to pretrain on several new datasets and included a wandb integration.

Colorado Reed 24 Oct 26, 2022
Bayesian Meta-Learning Through Variational Gaussian Processes

vmgp This is the repository of Vivek Myers and Nikhil Sardana for our CS 330 final project, Bayesian Meta-Learning Through Variational Gaussian Proces

Vivek Myers 2 Nov 17, 2022