WeakVRD-Captioning - Implementation of paper Improving Image Captioning with Better Use of Caption

Last update: Oct 28, 2022

Related tags

Overview

Paper "Improving image captioning with better use of captions"

@inproceedings{shi2020improving,
  title={Improving Image Captioning with Better Use of Caption},
  author={Shi, Zhan and Zhou, Xu and Qiu, Xipeng and Zhu, Xiaodan},
  booktitle={Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics},
  pages={7454--7464},
  year={2020}
}

Requirements

python 2.7.15

torch 1.0.1

Specific conda env is shown in ezs.yml

BTW, you need to download coco-captions and cider folder in this directory for evaluation.

Data Files and Models

Files: Add files in data directory in google drive or [baidu netdisk](链接：https://pan.baidu.com/s/1ddtfdlwD65cm4JmVu6GF3w 提取码：39pa) to data directory here. See data/README for more details.

Models: Add log directory in google drive or or [baidu netdisk](链接：https://pan.baidu.com/s/1ddtfdlwD65cm4JmVu6GF3w 提取码：39pa) here.

Scripts

MLE training:

python train.py --gpus 0 --id experiment-mle

RL training

python train.py --gpus 0 --id experiment-rl --learning_rate 2e-5 --resume_from experiment-mle --resume_from_best True --self_critical_after 0 --max_epochs 60 --learning_rate_decay_start -1 --scheduled_sampling_start -1 --reduce_on_plateau

Evaluate your own model or Load trained model:

python eval.py --gpus 0 --resume_from experiment-mle

and

python eval.py --gpus 0 --resume_from experiment-rl

Acknowledgement

This code is based on Ruotian Luo's brilliant image captioning repo ruotianluo/self-critical.pytorch. We use the detected bounding boxes/categories/features provided by Bottom-Up peteanderson80/bottom-up-attention, yangxuntu/SGAE. Many thanks for their work!

WeakVRD-Captioning - Implementation of paper Improving Image Captioning with Better Use of Caption

Related tags

Overview

Paper "Improving image captioning with better use of captions"

Requirements

Data Files and Models

Scripts

Acknowledgement

Owner

Code for Neural-GIF: Neural Generalized Implicit Functions for Animating People in Clothing(ICCV21)

Code to use Augmented Shapiro Wilks Stopping, as well as code for the paper "Statistically Signifigant Stopping of Neural Network Training"

[SDM 2022] Towards Similarity-Aware Time-Series Classification

TACTO: A Fast, Flexible and Open-source Simulator for High-Resolution Vision-based Tactile Sensors

Auto-Lama combines object detection and image inpainting to automate object removals

PSGAN running with ncnn⚡妆容迁移/仿妆⚡Imitation Makeup/Makeup Transfer⚡

This repository contains the code for the CVPR 2021 paper "GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields"

Cleaned test data list of DukeMTMC-reID, ICCV2021

Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning

House_prices_kaggle - Predict sales prices and practice feature engineering, RFs, and gradient boosting

CvT-ASSD: Convolutional vision-Transformerbased Attentive Single Shot MultiBox Detector (ICTAI 2021 CCF-C 会议)The 33rd IEEE International Conference on Tools with Artificial Intelligence

An interactive DNN Model deployed on web that predicts the chance of heart failure for a patient with an accuracy of 98%

Image Captioning using CNN and Transformers

clustering moroccan stocks time series data using k-means with dtw (dynamic time warping)

PyTorch implementation of Memory-based semantic segmentation for off-road unstructured natural environments.

An Official Repo of CVPR '20 "MSeg: A Composite Dataset for Multi-Domain Segmentation"

Official PyTorch Implementation of paper EAN: Event Adaptive Network for Efficient Action Recognition

Self-Learned Video Rain Streak Removal: When Cyclic Consistency Meets Temporal Correspondence

DCA - Official Python implementation of Delaunay Component Analysis algorithm

3DMV jointly combines RGB color and geometric information to perform 3D semantic segmentation of RGB-D scans.