Code for CVPR 2021 paper: Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning

Last update: Jan 03, 2023

Overview

Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning

This is the PyTorch companion code for the paper:

Amaia Salvador, Erhan Gundogdu, Loris Bazzani, and Michael Donoser. Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning. CVPR 2021

If you find this code useful in your research, please consider citing using the following BibTeX entry:

@inproceedings{salvador2021revamping,
    title={Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning},
    author={Salvador, Amaia and Gundogdu, Erhan and Bazzani, Loris and Donoser, Michael},
    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2021}
}

Cloning

This repository uses git-lfs to store model checkpoint files. Make sure to install it before cloning by following the instructions here:

Once installed, model checkpoint files will be automatically downloaded when cloning the repository with:

git clone git@github.com:amzn/image-to-recipe-transformers.git

These files can optionally be ignored by using git lfs install --skip-smudge before cloning the repository, and can be downloaded at any time using git lfs pull.
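If the repository was cloned with --skip-smudge, the checkpoint files on disk are small git-lfs pointer files rather than the real archives. The following is a minimal sketch for spotting them, assuming only the standard git-lfs pointer format (a text file whose first line is "version https://git-lfs.github.com/spec/v1"). The helper names is_lfs_pointer and find_unfetched are hypothetical and not part of this repository.

```python
from pathlib import Path

# First bytes of every git-lfs pointer file (git-lfs pointer spec, v1).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file looks like a git-lfs pointer, not real content."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def find_unfetched(directory):
    """List files under `directory` that are still git-lfs pointers."""
    return sorted(
        str(p)
        for p in Path(directory).rglob("*")
        if p.is_file() and is_lfs_pointer(p)
    )
```

Any path reported by find_unfetched("checkpoints") would still need git lfs pull before it can be extracted.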

Installation

Create the conda environment: conda env create -f environment.yml
Activate it: conda activate im2recipetransformers

Data preparation

Download and uncompress the Recipe1M dataset. The directory DATASET_PATH should contain the following:

layer1.json
layer2.json
train/
val/
test/

The directories train/, val/, and test/ must contain the image files for each split after uncompressing.
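Before running preprocessing, it can help to confirm that the layout above is in place. This is a small sketch, not part of the repository, that checks only the five entries listed above. The function name missing_entries is hypothetical.

```python
import os

# Entries the README lists as required under DATASET_PATH.
EXPECTED_FILES = ("layer1.json", "layer2.json")
EXPECTED_DIRS = ("train", "val", "test")


def missing_entries(dataset_path):
    """Return the expected files and directories that are absent from dataset_path."""
    missing = [
        name for name in EXPECTED_FILES
        if not os.path.isfile(os.path.join(dataset_path, name))
    ]
    missing += [
        name + "/" for name in EXPECTED_DIRS
        if not os.path.isdir(os.path.join(dataset_path, name))
    ]
    return missing
```

An empty list means the top-level layout matches. The check does not inspect whether the split directories actually contain images.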

Make splits and create vocabulary by running:

python preprocessing.py --root DATASET_PATH

This process will create auxiliary files under DATASET_PATH/traindata, which will be used for training.

Training

Launch training with:

python train.py --model_name model --root DATASET_PATH --save_dir /path/to/saved/model/checkpoints

TensorBoard logging can be enabled with --tensorboard. Then, from the checkpoints directory, run:

tensorboard --logdir "./" --port PORT

Run python train.py --help for the full list of available arguments.

Evaluation

Extract features from the trained model for the test set samples of Recipe1M:

python test.py --model_name model --eval_split test --root DATASET_PATH --save_dir /path/to/saved/model/checkpoints

Compute MedR and recall metrics for the extracted feature set:

python eval.py --embeddings_file /path/to/saved/model/checkpoints/model/feats_test.pkl --medr_N 10000
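For readers unfamiliar with these metrics: MedR is the median rank at which the true match is retrieved, and recall@K is the fraction of queries whose true match appears in the top K. The sketch below illustrates both in plain Python. It is not the code in eval.py; it assumes that query i and target i form a matching pair and that similarity is cosine similarity. It ranks against the full set passed in and does not reproduce whatever sampling --medr_N controls. The name retrieval_metrics is hypothetical.

```python
import math
from statistics import median


def _normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def retrieval_metrics(query_embs, target_embs, ks=(1, 5, 10)):
    """Return (median rank, {K: recall@K}) for row-aligned embedding lists."""
    queries = [_normalize(q) for q in query_embs]
    targets = [_normalize(t) for t in target_embs]
    ranks = []
    for i, q in enumerate(queries):
        # Cosine similarity between this query and every target.
        sims = [sum(a * b for a, b in zip(q, t)) for t in targets]
        # Rank of the true match: 1 + number of targets scoring strictly higher.
        ranks.append(1 + sum(s > sims[i] for s in sims))
    medr = median(ranks)
    recalls = {k: sum(r <= k for r in ranks) / len(ranks) for k in ks}
    return medr, recalls
```

This loop is quadratic in the number of pairs, so at realistic scale the same computation would normally be done with a matrix product in NumPy or PyTorch.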

Pretrained models

We provide pretrained model weights under the checkpoints directory. Make sure you run git lfs pull to download the model files.
Extract the zip files. Each model comes as a folder named MODEL_NAME containing two files: args.pkl and model-best.ckpt.
Extract features for the test set samples of Recipe1M using one of the pretrained models by running:

python test.py --model_name MODEL_NAME --eval_split test --root DATASET_PATH --save_dir ../checkpoints

A file with extracted features will be saved under ../checkpoints/MODEL_NAME.

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.
