"Learning and Analyzing Generation Order for Undirected Sequence Models" in Findings of EMNLP, 2021

Overview

undirected-generation-dev

This repo contains the source code of the models described in the following paper:

  • "Learning and Analyzing Generation Order for Undirected Sequence Models" in Findings of EMNLP, 2021. (paper).

The basic code structure was adapted from NYU's dl4mt-seqgen. We also use pybleu from fairseq to calculate BLEU scores during reinforcement learning.

0. Preparation

0.1 Dependencies

  • PyTorch 1.4.0/1.6.0/1.8.0

0.2 Data

The WMT'14 De-En data and the pretrained De-En MLM model are provided by the dl4mt-seqgen repository.

  • Download WMT'14 De-En valid/test data.
  • Then organize the data in data/ so that it follows this structure:
------ data
--------- de-en
------------ train.de-en.de.pth
------------ train.de-en.en.pth
------------ valid.de-en.de.pth
------------ valid.de-en.en.pth
------------ test.de-en.de.pth
------------ test.de-en.en.pth
  • Download pretrained models.
  • Then organize the pretrained masked language models in models/ so that they follow this structure:
------ models
--------- best-valid_en-de_mt_bleu.pth
--------- best-valid_de-en_mt_bleu.pth
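
Before training, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repo: `missing_files` is a hypothetical helper, and it assumes the `data/` and `models/` directories sit directly under the directory you pass in. The file names are copied from the listing above.

```python
from pathlib import Path

# File names copied from the directory listing in this README.
EXPECTED_DATA = [
    f"{split}.de-en.{lang}.pth"
    for split in ("train", "valid", "test")
    for lang in ("de", "en")
]
EXPECTED_MODELS = [
    "best-valid_en-de_mt_bleu.pth",
    "best-valid_de-en_mt_bleu.pth",
]


def missing_files(root="."):
    """Return the expected files that are absent under `root`.

    Hypothetical helper: assumes data/ and models/ live directly under `root`.
    """
    root = Path(root)
    expected = [root / "data" / "de-en" / name for name in EXPECTED_DATA]
    expected += [root / "models" / name for name in EXPECTED_MODELS]
    return [str(path) for path in expected if not path.is_file()]


if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing files:")
        for path in missing:
            print("  " + path)
    else:
        print("All expected data and model files are present.")
```

Running the script from the repo root prints any files that still need to be downloaded or moved.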

2. Training the order policy network with reinforcement learning

Train a policy network to predict the generation order for a pretrained De-En masked language model:

./train_scripts/train_order_rl_deen.sh
  • By default, the model checkpoints will be saved in models/learned_order_deen_uniform_4gpu/00_maxlen30_minlen5_bsz32.
  • This script trains the model only on De-En sentence pairs in which both the German and the English sentence have a maximum length of 30 and a minimum length of 5. You can change the training parameters max_len and min_len to adjust these length limits.
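
The length limits can be illustrated with a short sketch. This is not the repo's data loader: `within_length_limits` and `filter_pairs` are hypothetical names, and the sketch assumes that length is counted in tokens and that both bounds are inclusive. The actual training code may count subword units or special tokens differently.

```python
def within_length_limits(src_tokens, tgt_tokens, min_len=5, max_len=30):
    """True if BOTH sentences have a length in [min_len, max_len].

    Assumption: length is the number of tokens and the bounds are inclusive.
    """
    return all(min_len <= len(tokens) <= max_len for tokens in (src_tokens, tgt_tokens))


def filter_pairs(pairs, min_len=5, max_len=30):
    """Keep only the (source, target) pairs that satisfy the length limits."""
    return [
        (src, tgt)
        for src, tgt in pairs
        if within_length_limits(src, tgt, min_len, max_len)
    ]


if __name__ == "__main__":
    toy_pairs = [
        (["das", "ist", "ein", "kleiner", "Test"], ["this", "is", "a", "small", "test"]),
        (["hallo"], ["hello"]),  # too short on both sides, dropped
    ]
    print(len(filter_pairs(toy_pairs)))
```

A pair is dropped as soon as either side falls outside the limits, which matches the "both sentences" wording above.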

3. Decode the undirected generation model with learned orders

  • Set the MODEL_CKPT parameter to the corresponding path found under models/00_maxlen30_minlen5_bsz32. For example:
export MODEL_CKPT=wj8oc8kab4/checkpoint_epoch30+iter96875.pth
  • Evaluate the model on the WMT'14 De-En data by running:
export MODEL_CKPT=...
./eval_scripts/generate-order-deen.sh $MODEL_CKPT
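
If a run directory holds several checkpoints, a small helper can pick the most recent one to use as MODEL_CKPT. The sketch below is an assumption-laden illustration: `latest_checkpoint` is a hypothetical helper, and the `checkpoint_epoch<E>+iter<I>.pth` naming pattern is inferred only from the single example path above.

```python
import re

# Pattern inferred from the one example name shown in this README (an assumption).
CKPT_PATTERN = re.compile(r"checkpoint_epoch(\d+)\+iter(\d+)\.pth$")


def latest_checkpoint(paths):
    """Return the path with the highest (epoch, iteration), or None if none match."""
    best_path, best_key = None, None
    for path in paths:
        match = CKPT_PATTERN.search(path)
        if match is None:
            continue  # skip files that do not follow the assumed naming pattern
        key = (int(match.group(1)), int(match.group(2)))
        if best_key is None or key > best_key:
            best_path, best_key = path, key
    return best_path


if __name__ == "__main__":
    print(latest_checkpoint(["wj8oc8kab4/checkpoint_epoch30+iter96875.pth"]))
```

The returned string can then be exported as MODEL_CKPT before calling the evaluation script.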

4. Decode the undirected generation model with heuristic orders

  • Left2Right
./eval_scripts/generate-deen.sh left_right_greedy_1iter
  • Least2Most
./eval_scripts/generate-deen.sh least_most_greedy_1iter
  • EasyFirst
./eval_scripts/generate-deen.sh easy_first_greedy_1iter
  • Uniform
./eval_scripts/generate-deen.sh uniform_greedy_1iter
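
To make the idea of a generation order concrete, here is a minimal sketch of the two heuristics that do not depend on model scores. It is an illustration only, not the repo's implementation: the function names are hypothetical, and treating Uniform as a random permutation of positions is an assumption. Least2Most and EasyFirst are omitted because they choose positions from the model's predictions, which this README does not specify.

```python
import random


def left_to_right_order(length):
    """Positions 0, 1, ..., length - 1, filled in reading order."""
    return list(range(length))


def uniform_order(length, seed=None):
    """A random permutation of the positions.

    Assumption: each position is generated exactly once, in random order.
    The repo may sample positions differently.
    """
    rng = random.Random(seed)
    order = list(range(length))
    rng.shuffle(order)
    return order


if __name__ == "__main__":
    print(left_to_right_order(5))
    print(uniform_order(5, seed=0))
```

Each returned list gives the sequence of target positions that would be filled in, one per decoding step.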

Citation

@inproceedings{jiang-bansal-2021-learning-analyzing,
    title = "Learning and Analyzing Generation Order for Undirected Sequence Models",
    author = "Jiang, Yichen  and
      Bansal, Mohit",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.findings-emnlp.298",
    pages = "3513--3523",
}
Owner
Yichen Jiang