"Learning and Analyzing Generation Order for Undirected Sequence Models" in Findings of EMNLP, 2021

Overview

undirected-generation-dev

This repo contains the source code of the models described in the following paper

  • "Learning and Analyzing Generation Order for Undirected Sequence Models" in Findings of EMNLP, 2021. (paper).

The basic code structure was adapted from the NYU dl4mt-seqgen. We also use the pybleu from fairseq to calculate BLEU scores during the reinforcement learning.

0. Preparation

0.1 Dependencies

  • PyTorch 1.4.0/1.6.0/1.8.0

0.2 Data

The WMT'14 De-En data and the pretrained De-En MLM model are provided in the dl4mt-seqgen.

  • Download WMT'14 De-En valid/test data.
  • Then organize the data in data/ and make sure it follows such a structure:
------ data
--------- de-en
------------ train.de-en.de.pth
------------ train.de-en.en.pth
------------ valid.de-en.de.pth
------------ valid.de-en.en.pth
------------ test.de-en.de.pth
------------ test.de-en.en.pth
  • Download pretrained models.
  • Then organize the pretrained masked language models in models/ make sure it follows such a structure:
------ models
--------- best-valid_en-de_mt_bleu.pth
--------- best-valid_de-en_mt_bleu.pth

2. Training the order policy network with reinforcement learning

Train a policy network to predict the generation order for a pretrained De-En masked language model:

./train_scripts/train_order_rl_deen.sh
  • By defaults, the model checkpoints will be saved in models/learned_order_deen_uniform_4gpu/00_maxlen30_minlen5_bsz32.
  • By using this script, we are only training the model on De-En sentence pairs where both the German and English sentences with a maximum length of 30 and a minimum length of 5. You can change the training parameters max_len and min_len to change the length limits.

3. Decode the undirected generation model with learned orders

  • Set the MODEL_CKPT parameter to the corresponding path found under models/00_maxlen30_minlen5_bsz32. For example:
export MODEL_CKPT=wj8oc8kab4/checkpoint_epoch30+iter96875.pth
  • Evaluate the model on the SCAN MCD1 splits by running:
export MODEL_CKPT=...
./eval_scripts/generate-order-deen.sh $MODEL_CKPT

4. Decode the undirected generation model with heuristic orders

  • Left2Right
./eval_scripts/generate-deen.sh left_right_greedy_1iter
  • Least2Most
./eval_scripts/generate-deen.sh least_most_greedy_1iter
  • EasyFirst
./eval_scripts/generate-deen.sh easy_first_greedy_1iter
  • Uniform
./eval_scripts/generate-deen.sh uniform_greedy_1iter

Citation

@inproceedings{jiang-bansal-2021-learning-analyzing,
    title = "Learning and Analyzing Generation Order for Undirected Sequence Models",
    author = "Jiang, Yichen  and
      Bansal, Mohit",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.findings-emnlp.298",
    pages = "3513--3523",
}
Owner
Yichen Jiang
Yichen Jiang
This repository contains various models targetting multimodal representation learning, multimodal fusion for downstream tasks such as multimodal sentiment analysis.

Multimodal Deep Learning 🎆 🎆 🎆 Announcing the multimodal deep learning repository that contains implementation of various deep learning-based model

Deep Cognition and Language Research (DeCLaRe) Lab 398 Dec 30, 2022
A code generator from ONNX to PyTorch code

onnx-pytorch Generating pytorch code from ONNX. Currently support onnx==1.9.0 and torch==1.8.1. Installation From PyPI pip install onnx-pytorch From

Wenhao Hu 94 Jan 06, 2023
This is a demo app to be used in the video streaming applications

MoViDNN: A Mobile Platform for Evaluating Video Quality Enhancement with Deep Neural Networks MoViDNN is an Android application that can be used to ev

ATHENA Christian Doppler (CD) Laboratory 7 Jul 21, 2022
NaturalProofs: Mathematical Theorem Proving in Natural Language

NaturalProofs: Mathematical Theorem Proving in Natural Language NaturalProofs: Mathematical Theorem Proving in Natural Language Sean Welleck, Jiacheng

Sean Welleck 83 Jan 05, 2023
EDPN: Enhanced Deep Pyramid Network for Blurry Image Restoration

EDPN: Enhanced Deep Pyramid Network for Blurry Image Restoration Ruikang Xu, Zeyu Xiao, Jie Huang, Yueyi Zhang, Zhiwei Xiong. EDPN: Enhanced Deep Pyra

69 Dec 15, 2022
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.

About This repository provides data and code for the paper: Scalable Data Annotation Pipeline for High-Quality Large Speech Datasets Development (subm

Appen Repos 86 Dec 07, 2022
Torch-based tool for quantizing high-dimensional vectors using additive codebooks

Trainable multi-codebook quantization This repository implements a utility for use with PyTorch, and ideally GPUs, for training an efficient quantizer

Daniel Povey 41 Jan 07, 2023
buildseg is a building extraction plugin of QGIS based on PaddlePaddle.

buildseg buildseg is a Building Extraction plugin for QGIS based on PaddlePaddle. How to use Download and install QGIS and clone the repo : git clone

39 Dec 09, 2022
Repository for MDPGT

MD-PGT Repository for implementing and reproducing the results for the paper MDPGT: Momentum-based Decentralized Policy Gradient Tracking. Available E

Xian Yeow Lee 2 Dec 30, 2021
HEAM: High-Efficiency Approximate Multiplier Optimization for Deep Neural Networks

Approximate Multiplier by HEAM What's HEAM? HEAM is a general optimization method to generate high-efficiency approximate multipliers for specific app

4 Sep 11, 2022
A unified 3D Transformer Pipeline for visual synthesis

Overview This is the official repo for the paper: NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion. NÜWA is a unified multimodal p

Microsoft 2.6k Jan 06, 2023
Control-Robot-Arm-using-PS4-Controller - A Robotic Arm based on Raspberry Pi and Arduino that controlled by PS4 Controller

Control-Robot-Arm-using-PS4-Controller You can see all details about this Robot

MohammadReza Sharifi 5 Jan 01, 2022
Lab Materials for MIT 6.S191: Introduction to Deep Learning

This repository contains all of the code and software labs for MIT 6.S191: Introduction to Deep Learning! All lecture slides and videos are available

Alexander Amini 5.6k Dec 26, 2022
An open source Jetson Nano baseboard and tools to design your own.

My Jetson Nano Baseboard This basic baseboard gives the user the foundation and the flexibility to design their own baseboard for the Jetson Nano. It

NVIDIA AI IOT 57 Dec 29, 2022
PaddleRobotics is an open-source algorithm library for robots based on Paddle, including open-source parts such as human-robot interaction, complex motion control, environment perception, SLAM positioning, and navigation.

简体中文 | English PaddleRobotics paddleRobotics是基于paddle的机器人开源算法库集,包括人机交互、复杂运动控制、环境感知、slam定位导航等开源算法部分。 人机交互 主动多模交互技术TFVT-HRI 主动多模交互技术是通过视觉、语音、触摸传感器等输入机器人

185 Dec 26, 2022
Companion code for "Bayesian logistic regression for online recalibration and revision of risk prediction models with performance guarantees"

Companion code for "Bayesian logistic regression for online recalibration and revision of risk prediction models with performance guarantees" Installa

0 Oct 13, 2021
Over-the-Air Ensemble Inference with Model Privacy

Over-the-Air Ensemble Inference with Model Privacy This repository contains simulations for our private ensemble inference method. Installation Instal

Selim Firat Yilmaz 1 Jun 29, 2022
Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation

Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation Requirements This repository needs mmsegmentation Training To train

Adelaide Intelligent Machines (AIM) Group 7 Sep 12, 2022
This is a model made out of Neural Network specifically a Convolutional Neural Network model

This is a model made out of Neural Network specifically a Convolutional Neural Network model. This was done with a pre-built dataset from the tensorflow and keras packages. There are other alternativ

9 Oct 18, 2022
Repository for code and dataset for our EMNLP 2021 paper - “So You Think You’re Funny?”: Rating the Humour Quotient in Standup Comedy.

AI-OpenMic Dataset The dataset is available for download via the follwing link. Repository for code and dataset for our EMNLP 2021 paper - “So You Thi

6 Oct 26, 2022