PyTorch code for the NAACL 2021 paper "Improving Generation and Evaluation of Visual Stories via Semantic Consistency"

Last update: Dec 08, 2022

Improving Generation and Evaluation of Visual Stories via Semantic Consistency

This repository contains the PyTorch code for the NAACL 2021 paper of the same title. Link to arXiv paper: https://arxiv.org/abs/2105.10026

Requirements:

This code has been tested with torch==1.7.1 and torchvision==0.8.2.
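
As a quick sanity check before training, the installed packages can be compared against the tested versions. The following is a minimal sketch, not part of the repository: the helper names (`check_versions`, `installed_versions`) are hypothetical, and it assumes that a local build suffix such as `+cu110` should be ignored when comparing.

```python
from importlib import metadata

# Versions the README reports as tested.
TESTED_VERSIONS = {"torch": "1.7.1", "torchvision": "0.8.2"}


def base_version(version):
    """Drop a local build suffix such as '+cu110' from a version string."""
    return version.split("+", 1)[0]


def check_versions(installed, tested=TESTED_VERSIONS):
    """Return a list of human-readable mismatches (empty if all match)."""
    problems = []
    for name, wanted in tested.items():
        found = installed.get(name)
        if found is None:
            problems.append(f"{name} is not installed (tested with {wanted})")
        elif base_version(found) != wanted:
            problems.append(f"{name}=={found} differs from tested {wanted}")
    return problems


def installed_versions(names):
    """Look up installed package versions; missing packages map to None."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for line in check_versions(installed_versions(TESTED_VERSIONS)):
        print("warning:", line)
```

A mismatch does not necessarily mean the code will fail; it only signals that the setup differs from the one the authors tested.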

Prepare Repository:

Download the PororoSV dataset and associated files from here and save them as ./data. Download the GloVe embeddings (glove.840B.300d) from here. The default location of the embeddings is ./data/ (see ./dcsgan/miscc/config.py).
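
For reference, GloVe embeddings are distributed as plain text, one token per line followed by its vector components. The sketch below shows one way to read such a file into a dictionary; it is an illustration under stated assumptions, not the loader used by this repository. The function name `load_glove` and its parameters are hypothetical. It splits from the right because a token may itself contain spaces.

```python
def load_glove(path, dim=300, vocab=None):
    """Read a GloVe text file into {token: [float, ...]}.

    path  -- location of the text file (assumed to be UTF-8 encoded)
    dim   -- number of vector components per line
    vocab -- optional set of tokens to keep; None keeps everything
    """
    vectors = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.rstrip().split(" ")
            if len(parts) <= dim:
                continue  # malformed or empty line
            # Everything before the last `dim` fields is the token itself.
            word = " ".join(parts[:-dim])
            if vocab is not None and word not in vocab:
                continue
            vectors[word] = [float(value) for value in parts[-dim:]]
    return vectors
```

Passing a `vocab` set keeps memory use small, since only the tokens that appear in the dataset's captions need to be kept.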

Training DuCo-StoryGAN:

To train DuCo-StoryGAN, first train the VideoCaptioning model on the PororoSV dataset:
python train_mart.py --data_dir
Default parameters were used to train the model used in our paper.

Next, train the generative model:
python train_gan.py --cfg ./cfg/pororo_s1_duco.yml --data_dir
If training DuCo-StoryGAN on a new dataset, make sure to train the Video Captioning model (see below) before training the GAN. The vocabulary file prepared for the video captioning model is re-used to generate common input_ids for both models. Change the location of the video captioning checkpoint in the config file.
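
The idea of sharing one vocabulary so that both models see the same input_ids can be sketched as follows. This is only an illustration: the special tokens, the whitespace tokenizer, and the function names are assumptions and do not reflect the repository's actual vocabulary format.

```python
import json
from collections import Counter

# Hypothetical special tokens; the repository's vocabulary may use others.
PAD, UNK = "<pad>", "<unk>"


def build_vocab(captions, min_count=1):
    """Map each sufficiently frequent token to an integer id."""
    counts = Counter(tok for cap in captions for tok in cap.lower().split())
    words = sorted(w for w, n in counts.items() if n >= min_count)
    return {word: idx for idx, word in enumerate([PAD, UNK] + words)}


def encode(caption, word2idx, max_len):
    """Convert a caption to a fixed-length list of ids, padding at the end."""
    ids = [word2idx.get(tok, word2idx[UNK]) for tok in caption.lower().split()]
    ids = ids[:max_len]
    return ids + [word2idx[PAD]] * (max_len - len(ids))


if __name__ == "__main__":
    vocab = build_vocab(["Pororo waves", "Crong waves"])
    # Serialising once and loading the same file in both training scripts
    # keeps the ids consistent between the two models.
    shared = json.loads(json.dumps(vocab))
    print(encode("Pororo jumps", shared, max_len=4))
```

Because both models index into the same mapping, a caption encoded for the GAN can be fed to the captioning model without re-tokenizing.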

Unless specified otherwise, the default output root directory for all model checkpoints is ./out/.

Training Evaluation Models:

Video Captioning Model
The video captioning model trained for DuCo-StoryGAN (see above) is used for evaluation.
python train_mart.py --data_dir
Hierarchical Deep Multimodal Similarity (H-DAMSM)
python train_damsm.py --cfg ./cfg/pororo_damsm.yml --data_dir
Character Classifier
python train_classifier.py --data_dir --model_name inception --save_path ./models/inception --batch_size 8 --learning_rate 1e-05

Inference from DuCo-StoryGAN:

Use the following command to infer from trained weights for DuCo-StoryGAN:
python train_gan.py --cfg ./cfg/pororo_s1_duco_eval.yml --data_dir --checkpoint --infer_dir

Download our pretrained checkpoint from here.

Evaluation:

Download the pretrained models for evaluation:
Character Classifier, Video Captioning

Use the following command to evaluate classification accuracy of generated images:
python eval_scripts/eval_classifier.py --image_path --data_dir --model_path --model_name inception --mode
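
To make the notion of classification accuracy concrete, the sketch below scores multi-label character predictions in two common ways: exact-match accuracy per frame and micro-averaged F1 over characters. It assumes each frame is labelled with a binary vector (one entry per character); the metrics actually reported by eval_classifier.py may be computed differently, and the function names here are hypothetical.

```python
def frame_accuracy(predictions, labels):
    """Fraction of frames whose predicted character set matches exactly."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    exact = sum(1 for p, y in zip(predictions, labels) if list(p) == list(y))
    return exact / len(labels)


def character_f1(predictions, labels):
    """Micro-averaged F1 over all (frame, character) decisions."""
    tp = fp = fn = 0
    for pred, gold in zip(predictions, labels):
        for p, y in zip(pred, gold):
            if p and y:
                tp += 1
            elif p and not y:
                fp += 1
            elif y and not p:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    preds = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
    golds = [[1, 0, 1], [0, 1, 1], [1, 0, 0]]
    print(frame_accuracy(preds, golds), character_f1(preds, golds))
```

Exact-match accuracy is strict, since a single missing character makes the whole frame count as wrong, whereas F1 gives partial credit.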

Use the following command to evaluate the BLEU score of generated images:
python eval_scripts/translate.py --batch_size 50 --pred_dir --data_dir --checkpoint_file --eval_mode
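
For readers unfamiliar with the metric, BLEU compares n-gram overlap between a generated caption and a reference caption. The following is a simplified, self-contained sketch of corpus-level BLEU with a single reference per hypothesis and no smoothing; it is not the implementation used by translate.py, and established toolkits should be preferred for reported numbers.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Unsmoothed corpus BLEU with one reference per hypothesis."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
            # Clipped matches: each n-gram counts at most as often as in ref.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives BLEU 0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty discourages hypotheses shorter than the references.
    penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return penalty * math.exp(log_precision)


if __name__ == "__main__":
    hyp = "pororo is walking in the snow".split()
    ref = "pororo is walking on the snow".split()
    print(corpus_bleu([hyp], [ref], max_n=2))
```

Short captions often share no 4-grams with the reference, in which case the unsmoothed score is zero; this is one reason evaluation is done at the corpus level.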

Acknowledgements

The code in this repository has been adapted from the MART, StoryGAN and MirrorGAN codebases.

Owner

Adyasha Maharana
