TANL: Structured Prediction as Translation between Augmented Natural Languages

Related tags

Deep Learningtanl
Overview

TANL: Structured Prediction as Translation between Augmented Natural Languages

Code for the paper "Structured Prediction as Translation between Augmented Natural Languages" (ICLR 2021).

If you use this code, please cite the paper using the bibtex reference below.

@inproceedings{tanl,
    title={Structured Prediction as Translation between Augmented Natural Languages},
    author={Giovanni Paolini and Ben Athiwaratkun and Jason Krone and Jie Ma and Alessandro Achille and Rishita Anubhai and Cicero Nogueira dos Santos and Bing Xiang and Stefano Soatto},
    booktitle={9th International Conference on Learning Representations, {ICLR} 2021},
    year={2021},
}

Requirements

  • Python 3.6+
  • PyTorch (tested with version 1.7.1)
  • Transformers (tested with version 4.0.0)
  • NetworkX (tested with version 2.5, only used in coreference resolution)

You can install all required Python packages with pip install -r requirements.txt

Datasets

By default, datasets are expected to be in data/DATASET_NAME. Dataset-specific code is in datasets.py.

For example, the CoNLL04 and ADE datasets (joint entity and relation extraction) in the correct format can be downloaded using https://github.com/markus-eberts/spert/blob/master/scripts/fetch_datasets.sh. For other datasets, pre-processing and links are documented in the code.

Running the code

Use the following command: python run.py JOB

The JOB argument refers to a section of the config file, which by default is config.ini. A sample config file is provided, with settings that allow for a faster training and less memory usage than the settings used to obtain the final results in the paper.

For example, to replicate the paper's results on CoNLL04, have the following section in the config file:

[conll04_final]
datasets = conll04
model_name_or_path = t5-base
num_train_epochs = 200
max_seq_length = 256
max_seq_length_eval = 512
train_split = train,dev
per_device_train_batch_size = 8
per_device_eval_batch_size = 16
do_train = True
do_eval = False
do_predict = True
episodes = 1-10
num_beams = 8

Then run python run.py conll04_final. Note that the final results will differ slightly from the ones reported in the paper, due to small code changes and randomness.

Config arguments can be overwritten by command line arguments. For example: python run.py conll04_final --num_train_epochs 50.

Additional details

If do_train = True, the model is trained on the given train split (e.g., 'train') of the given datasets. The final weights and intermediate checkpoints are written in a directory such as experiments/conll04_final-t5-base-ep200-len256-b8-train, with one subdirectory per episode. Results in JSON format are also going to be saved there.

In every episode, the model is trained on a different (random) permutation of the training set. The random seed is given by the episode number, so that every episode always produces the same exact model.

Once a model is trained, it is possible to evaluate it without training again. For this, set do_train = False or (more easily) provide the -e command-line argument: python run.py conll04_final -e.

If do_eval = True, the model is evaluated on the 'dev' split. If do_predict = True, the model is evaluated on the 'test' split.

Arguments

The following are the most important command-line arguments for the run.py script. Run python run.py -h for the full list.

  • -c CONFIG_FILE: specify config file to use (default is config.ini)
  • -e: only run evaluation (overwrites the setting do_train in the config file)
  • -a: evaluate also intermediate checkpoints, in addition to the final model
  • -v : print results for each evaluation run
  • -g GPU: specify which GPU to use for evaluation

The following are the most important arguments for the config file. See the sample config file to understand the format.

  • datasets (str): comma-separated list of datasets for training
  • eval_datasets (str): comma-separated list of datasets for evaluation (default is the same as for training)
  • model_name_or_path (str): path to pretrained model or model identifier from huggingface.co/models (e.g. t5-base)
  • do_train (bool): whether to run training (default is False)
  • do_eval (bool): whether to run evaluation on the dev set (default is False)
  • do_predict (bool): whether to run evaluation on the test set (default is False)
  • train_split (str): comma-separated list of data splits for training (default is train)
  • num_train_epochs (int): number of train epochs
  • learning_rate (float): initial learning rate (default is 5e-4)
  • train_subset (float > 0 and <=1): portion of training data to effectively use during training (default is 1, i.e., use all training data)
  • per_device_train_batch_size (int): batch size per GPU during training (default is 8)
  • per_device_eval_batch_size (int): batch size during evaluation (default is 8; only one GPU is used for evaluation)
  • max_seq_length (int): maximum input sequence length after tokenization; longer sequences are truncated
  • max_output_seq_length (int): maximum output sequence length (default is max_seq_length)
  • max_seq_length_eval (int): maximum input sequence length for evaluation (default is max_seq_length)
  • max_output_seq_length_eval (int): maximum output sequence length for evaluation (default is max_output_seq_length or max_seq_length_eval or max_seq_length)
  • episodes (str): episodes to run (default is 0; an interval can be specified, such as 1-4; the episode number is used as the random seed)
  • num_beams (int): number of beams for beam search during generation (default is 1)
  • multitask (bool): if True, the name of the dataset is prepended to each input sentence (default is False)

See arguments.py and transformers.TrainingArguments for additional config arguments.

Codes for 'Dual Parameterization of Sparse Variational Gaussian Processes'

Dual Parameterization of Sparse Variational Gaussian Processes Documentation | Notebooks | API reference Introduction This repository is the official

AaltoML 7 Dec 23, 2022
An interactive DNN Model deployed on web that predicts the chance of heart failure for a patient with an accuracy of 98%

Heart Failure Predictor About A Web UI deployed Dense Neural Network Model Made using Tensorflow that predicts whether the patient is healthy or has c

Adit Ahmedabadi 0 Jan 09, 2022
The Incredible PyTorch: a curated list of tutorials, papers, projects, communities and more relating to PyTorch.

This is a curated list of tutorials, projects, libraries, videos, papers, books and anything related to the incredible PyTorch. Feel free to make a pu

Ritchie Ng 9.2k Jan 02, 2023
PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

Ubisoft 76 Dec 30, 2022
YouRefIt: Embodied Reference Understanding with Language and Gesture

YouRefIt: Embodied Reference Understanding with Language and Gesture YouRefIt: Embodied Reference Understanding with Language and Gesture by Yixin Che

16 Jul 11, 2022
A Python wrapper for Google Tesseract

Python Tesseract Python-tesseract is an optical character recognition (OCR) tool for python. That is, it will recognize and "read" the text embedded i

Matthias A Lee 4.6k Jan 05, 2023
Repository of continual learning papers

Continual learning paper repository This repository contains an incomplete (but dynamically updated) list of papers exploring continual learning in ma

29 Jan 05, 2023
For the paper entitled ''A Case Study and Qualitative Analysis of Simple Cross-Lingual Opinion Mining''

Summary This is the source code for the paper "A Case Study and Qualitative Analysis of Simple Cross-Lingual Opinion Mining", which was accepted as fu

1 Nov 10, 2021
Implementation of average- and worst-case robust flatness measures for adversarial training.

Relating Adversarially Robust Generalization to Flat Minima This repository contains code corresponding to the MLSys'21 paper: D. Stutz, M. Hein, B. S

David Stutz 13 Nov 27, 2022
Building Ellee — A GPT-3 and Computer Vision Powered Talking Robotic Teddy Bear With Human Level Conversation Intelligence

Using an object detection and facial recognition system built on MobileNetSSDV2 and Dlib and running on an NVIDIA Jetson Nano, a GPT-3 model, Google Speech Recognition, Amazon Polly and servo motors,

24 Oct 26, 2022
The official implementation of the IEEE S&P`22 paper "SoK: How Robust is Deep Neural Network Image Classification Watermarking".

Watermark-Robustness-Toolbox - Official PyTorch Implementation This repository contains the official PyTorch implementation of the following paper to

49 Dec 19, 2022
PlaidML is a framework for making deep learning work everywhere.

A platform for making deep learning work everywhere. Documentation | Installation Instructions | Building PlaidML | Contributing | Troubleshooting | R

PlaidML 4.5k Jan 02, 2023
This is the latest version of the PULP SDK

PULP-SDK This is the latest version of the PULP SDK, which is under active development. The previous (now legacy) version, which is no longer supporte

78 Dec 07, 2022
Code for "Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans" CVPR 2021 best paper candidate

News 05/17/2021 To make the comparison on ZJU-MoCap easier, we save quantitative and qualitative results of other methods at here, including Neural Vo

ZJU3DV 748 Jan 07, 2023
Generating Anime Images by Implementing Deep Convolutional Generative Adversarial Networks paper

AnimeGAN - Deep Convolutional Generative Adverserial Network PyTorch implementation of DCGAN introduced in the paper: Unsupervised Representation Lear

Rohit Kukreja 23 Jul 21, 2022
Exploiting a Zoo of Checkpoints for Unseen Tasks

Exploiting a Zoo of Checkpoints for Unseen Tasks This repo includes code to reproduce all results in the above Neurips paper, authored by Jiaji Huang,

Baidu Research 8 Sep 06, 2022
Implementation of CVPR 2021 paper "Spatially-invariant Style-codes Controlled Makeup Transfer"

SCGAN Implementation of CVPR 2021 paper "Spatially-invariant Style-codes Controlled Makeup Transfer" Prepare The pre-trained model is avaiable at http

118 Dec 12, 2022
This repo provides code for QB-Norm (Cross Modal Retrieval with Querybank Normalisation)

This repo provides code for QB-Norm (Cross Modal Retrieval with Querybank Normalisation) Usage example python dynamic_inverted_softmax.py --sims_train

36 Dec 29, 2022
Pytorch GUI(demo) for iVOS(interactive VOS) and GIS (Guided iVOS)

GUI for iVOS(interactive VOS) and GIS (Guided iVOS) GUI Implementation of CVPR2021 paper "Guided Interactive Video Object Segmentation Using Reliabili

Yuk Heo 13 Dec 09, 2022
Multiple Object Tracking with Yolov5!

Tracking with yolov5 This implementation is for who need to tracking multi-object only with detector. You can easily track mult-object with your well

9 Nov 08, 2022