An unreferenced image captioning metric (ACL-21)

Last update: Nov 20, 2022

UMIC

This repository provides an unreferenced image captioning metric from our ACL 2021 paper, UMIC: An Unreferenced Metric for Image Captioning via Contrastive Learning.
Here, we provide the code to compute UMIC.

Usage (Updating the Descriptions)

Our code is based on UNITER, so please follow the UNITER installation guideline for Docker to set up the environment. We plan to release a version that does not require Docker in the next few weeks.

1. Install Prerequisites

We used the Docker image provided by the official UNITER repo. Please install Docker by following the guideline in that repo.

2. Download the Visual Features

The COCO dataset is widely used for the image captioning task. To get the visual features for COCO captions, download the image features for the COCO validation split using the following command.

wget https://acvrpublicycchen.blob.core.windows.net/uniter/img_db/coco_val2014.tar

Please refer to the official UNITER repo to download other visual features.
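
The downloaded file is a tar archive, so it has to be unpacked before use. The following is a minimal sketch of that step using only the Python standard library. The function name `extract_features` and the destination directory are hypothetical; where the features must live depends on how the UNITER Docker setup mounts your storage, which this README does not specify.

```python
import tarfile
from pathlib import Path


def extract_features(tar_path, dest_dir):
    """Unpack a feature archive into dest_dir and return its top-level entries.

    dest_dir is a placeholder: choose the location your UNITER setup expects.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tar_path) as archive:
        # Only extract archives obtained from a source you trust.
        archive.extractall(dest)
    return sorted(p.name for p in dest.iterdir())
```

For example, `extract_features("coco_val2014.tar", "<your storage directory>")` would unpack the archive fetched by the wget command above.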

3. Pre-processing the Textual Features (Captions)

The textual feature file is a Python dictionary saved in JSON format, with the following keys:
'cands' : [list of candidate captions]
'img_fs' : [list of image file names]
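
As an illustration of the format above, here is a small sketch that builds such a dictionary and writes it to a JSON file. Only the keys `cands` and `img_fs` come from this README. The helper names, the example captions, the image file names, and the assumption that the two lists are aligned index by index (caption i describes image i) are ours, so please check them against the data files shipped with the repo.

```python
import json


def make_caption_dict(cands, img_fs):
    """Pair candidate captions with image file names.

    Assumes one caption per image, aligned by position (an assumption,
    not something this README states).
    """
    if len(cands) != len(img_fs):
        raise ValueError(
            f"{len(cands)} captions but {len(img_fs)} image file names"
        )
    return {"cands": list(cands), "img_fs": list(img_fs)}


def save_caption_file(data, path):
    """Write the caption dictionary to path as JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False, indent=2)


# Illustrative values only; replace with your own captions and images.
example = make_caption_dict(
    cands=["a man riding a wave on a surfboard", "two dogs playing in the snow"],
    img_fs=["COCO_val2014_000000000042.jpg", "COCO_val2014_000000000073.jpg"],
)
print(json.dumps(example, indent=2))
```

Calling `save_caption_file(example, "my_captions.json")` would then write the file. The file name is a placeholder, since this README does not state where `compute_score.py` looks for the caption file.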

4. Running the Script

Launching Docker

source launch_activate.sh $PATH_TO_STORAGE

Compute Score

python compute_score.py --data_type capeval1k \
                        --ckpt /storage/umic.pt \
                        --img_type coco_val2014

Reference

If you find this repo useful, please consider citing:

@inproceedings{lee-etal-2021-umic,
    title = "{UMIC}: An Unreferenced Metric for Image Captioning via Contrastive Learning",
    author = "Lee, Hwanhee  and
      Yoon, Seunghyun  and
      Dernoncourt, Franck  and
      Bui, Trung  and
      Jung, Kyomin",
    booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.acl-short.29",
    doi = "10.18653/v1/2021.acl-short.29",
    pages = "220--226",
}

Owner

hwanheelee