NAACL 2022: MCSE: Multimodal Contrastive Learning of Sentence Embeddings

Last update: Nov 15, 2022

MCSE: Multimodal Contrastive Learning of Sentence Embeddings

This repository contains code and pre-trained models for our NAACL-2022 paper MCSE: Multimodal Contrastive Learning of Sentence Embeddings. If you find this repository useful, please consider citing our paper.

Contact: Miaoran Zhang ([email protected])

Pre-trained Models & Results

Model	Avg. STS
flickr-mcse-bert-base-uncased [Google Drive]	77.70
flickr-mcse-roberta-base [Google Drive]	78.44
coco-mcse-bert-base-uncased [Google Drive]	77.08
coco-mcse-roberta-base [Google Drive]	78.17

Note: flickr indicates that models are trained on wiki+flickr, and coco indicates that models are trained on wiki+coco.
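
For readers unfamiliar with the metric, the sketch below shows how an STS score of this kind is commonly computed: Spearman's rank correlation (times 100) between the cosine similarities of sentence-embedding pairs and human similarity ratings. That the Avg. STS column follows this SentEval/SimCSE convention is an assumption here, and the function names (`cosine`, `ranks`, `spearman`, `sts_score`) are illustrative, not part of this repository.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = shared_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; raises if either input is constant."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0 or std_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def sts_score(embedding_pairs, gold_scores):
    """Spearman correlation x 100 between cosine similarities and gold ratings."""
    similarities = [cosine(u, v) for u, v in embedding_pairs]
    return 100.0 * spearman(similarities, gold_scores)
```

An average over several STS tasks would then be the mean of the per-task `sts_score` values.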

Quickstart

Setup

Python 3.9.5
PyTorch 1.7.1
Install other packages:

pip install -r requirements.txt

Data Preparation

Please organize the data directory as follows:

REPO ROOT
|
|--data    
|  |--wiki1m_for_simcse.txt  
|  |--flickr_random_captions.txt    
|  |--flickr_resnet.hdf5    
|  |--coco_random_captions.txt    
|  |--coco_resnet.hdf5
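
Before training, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the helper name `missing_data_files` is hypothetical, and it assumes the script is run from the repository root. A run that uses only one of the two image datasets presumably does not need the other dataset's files.

```python
from pathlib import Path

# File names taken from the directory layout above.
EXPECTED_FILES = [
    "wiki1m_for_simcse.txt",
    "flickr_random_captions.txt",
    "flickr_resnet.hdf5",
    "coco_random_captions.txt",
    "coco_resnet.hdf5",
]


def missing_data_files(data_dir="data"):
    """Return the expected file names that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Missing files in data/:", ", ".join(missing))
    else:
        print("All expected data files are present.")
```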

Wiki1M

wget https://huggingface.co/datasets/princeton-nlp/datasets-for-simcse/resolve/main/wiki1m_for_simcse.txt

Flickr30k & MS-COCO
You can either download the preprocessed data we used (annotation sources: flickr30k-entities and coco).

Or preprocess the data yourself (taking Flickr30k as an example):

Download flickr30k-entities.
Request access to the flickr-images from here. Note that the use of the images must abide by the Flickr Terms of Use.

Run script:

unzip ${path_to_flickr-entities}/annotations.zip

python preprocess/prepare_flickr.py \
    --flickr_entities_dir ${path_to_flickr-entities} \
    --flickr_images_dir ${path_to_flickr-images} \
    --output_dir data/ \
    --batch_size 32

Train & Evaluation

Prepare the SentEval datasets for evaluation:

cd SentEval/data/downstream/
bash download_dataset.sh

Run scripts:
```
# For example:  (more examples are given in scripts/.)
sh scripts/run_wiki_flickr.sh
```
Note: In the paper, we run experiments with 5 seeds (0, 1, 2, 3, 4). You can find the detailed parameter settings in the Appendix of the paper.
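
When repeating a run over several seeds, the per-seed scores are typically summarized by their mean and standard deviation. The sketch below illustrates that aggregation only. The helper `summarize_runs` is hypothetical, the numbers are made-up placeholders rather than results from the paper, and how each per-seed score is read from the training output is left out.

```python
from statistics import mean, stdev


def summarize_runs(scores_by_seed):
    """Return (mean, sample standard deviation) over per-seed scores."""
    scores = list(scores_by_seed.values())
    # stdev needs at least two values; report 0.0 for a single run.
    spread = stdev(scores) if len(scores) > 1 else 0.0
    return mean(scores), spread


if __name__ == "__main__":
    # Placeholder values for illustration only; NOT results from the paper.
    example = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0, 4: 5.0}
    avg, spread = summarize_runs(example)
    print(f"Score over {len(example)} seeds: {avg:.2f} +/- {spread:.2f}")
```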

Acknowledgements

The extremely clear and well-organized codebase: SimCSE
SentEval toolkit

Owner

Saarland University Spoken Language Systems Group
