WeakVRD-Captioning - Implementation of paper Improving Image Captioning with Better Use of Caption

Last update: Oct 28, 2022

Overview

Paper "Improving image captioning with better use of captions"

@inproceedings{shi2020improving,
  title={Improving Image Captioning with Better Use of Caption},
  author={Shi, Zhan and Zhou, Xu and Qiu, Xipeng and Zhu, Xiaodan},
  booktitle={Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics},
  pages={7454--7464},
  year={2020}
}

Requirements

python 2.7.15

torch 1.0.1

The exact conda environment is specified in ezs.yml.

For evaluation, you also need to download the coco-captions and cider folders into this directory.
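Before running evaluation, it can help to confirm that the two folders are actually in place. The following is a minimal sketch, not part of the repository: the helper name `missing_eval_dirs` is hypothetical, and the folder names are assumed to match the wording above exactly (your checkout may use a slightly different name, such as coco-caption). It avoids Python 3 only syntax so that it also runs under Python 2.7.

```python
from __future__ import print_function

import os

# Assumed folder names, copied from the Requirements text above.
# Adjust them if your checkout uses different names.
REQUIRED_DIRS = ["coco-captions", "cider"]


def missing_eval_dirs(repo_root=".", required=REQUIRED_DIRS):
    """Return the required folders that do not exist under repo_root."""
    return [name for name in required
            if not os.path.isdir(os.path.join(repo_root, name))]


if __name__ == "__main__":
    missing = missing_eval_dirs()
    if missing:
        print("Missing evaluation folders: " + ", ".join(missing))
    else:
        print("All evaluation folders found.")
```

Run it from the repository root; an empty result means both folders were found.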

Data Files and Models

Files: Copy the files from the data directory on Google Drive or Baidu Netdisk (link: https://pan.baidu.com/s/1ddtfdlwD65cm4JmVu6GF3w, extraction code: 39pa) into the data directory here. See data/README for more details.

Models: Copy the log directory from Google Drive or Baidu Netdisk (link: https://pan.baidu.com/s/1ddtfdlwD65cm4JmVu6GF3w, extraction code: 39pa) here.

Scripts

MLE training:

python train.py --gpus 0 --id experiment-mle

RL training (resumes from the best checkpoint of the MLE run):

python train.py --gpus 0 --id experiment-rl --learning_rate 2e-5 --resume_from experiment-mle --resume_from_best True --self_critical_after 0 --max_epochs 60 --learning_rate_decay_start -1 --scheduled_sampling_start -1 --reduce_on_plateau

Evaluate your own model or load a trained model:

python eval.py --gpus 0 --resume_from experiment-mle

and

python eval.py --gpus 0 --resume_from experiment-rl
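The RL run and both evaluation runs refer back to experiment ids chosen earlier, so a typo in one id breaks the chain. The sketch below is an illustration only, not part of the repository: it assembles the four commands shown above from a single pair of ids. The helper names (`mle_command`, `rl_command`, `eval_command`, `pipeline`) are hypothetical, and only flags that appear in the commands above are used.

```python
from __future__ import print_function


def mle_command(mle_id, gpus="0"):
    """Command line for the MLE training stage."""
    return ["python", "train.py", "--gpus", gpus, "--id", mle_id]


def rl_command(rl_id, mle_id, gpus="0"):
    """Command line for the RL stage, resuming from the MLE run."""
    return ["python", "train.py", "--gpus", gpus, "--id", rl_id,
            "--learning_rate", "2e-5",
            "--resume_from", mle_id,
            "--resume_from_best", "True",
            "--self_critical_after", "0",
            "--max_epochs", "60",
            "--learning_rate_decay_start", "-1",
            "--scheduled_sampling_start", "-1",
            "--reduce_on_plateau"]


def eval_command(model_id, gpus="0"):
    """Command line for evaluating a trained run."""
    return ["python", "eval.py", "--gpus", gpus, "--resume_from", model_id]


def pipeline(mle_id="experiment-mle", rl_id="experiment-rl", gpus="0"):
    """All four commands in the order they are meant to be run."""
    return [mle_command(mle_id, gpus),
            rl_command(rl_id, mle_id, gpus),
            eval_command(mle_id, gpus),
            eval_command(rl_id, gpus)]


if __name__ == "__main__":
    # Print the commands; nothing is executed here.
    for cmd in pipeline():
        print(" ".join(cmd))
```

Each command is a list of arguments, so it can be passed directly to `subprocess.check_call(cmd)` if you prefer to launch the stages from Python rather than copy them into a shell.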

Acknowledgement

This code is based on Ruotian Luo's brilliant image captioning repo, ruotianluo/self-critical.pytorch. We use the detected bounding boxes, categories, and features provided by Bottom-Up (peteanderson80/bottom-up-attention) and yangxuntu/SGAE. Many thanks for their work!
