Does Pretraining for Summarization Require Knowledge Transfer?

Overview

This repository is the official implementation of the work in the paper Does Pretraining for Summarization Require Knowledge Transfer?, which appears in Findings of EMNLP 2021.
You can find the paper on arXiv here: https://arxiv.org/abs/2109.04953

Requirements

This code requires Python 3 (tested with version 3.6).

To install requirements, run:

pip install -r requirements.txt

Preparing finetuning datasets

To prepare a summarization dataset for finetuning, run the corresponding script in the finetuning_datasetgen folder. For example, to prepare the CNN/DailyMail dataset, run:

cd finetuning_datasetgen
python cnndm.py
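
If you want to quickly sanity-check the prepared data, the sketch below reads the first example back. Note that the output path and field names here are assumptions for illustration; check what cnndm.py actually writes on your machine.

import json

# Peek at the first example of the prepared dataset.
# NOTE: the path and the field names are assumed for illustration only.
with open("dataset_root/finetuning_datasets/cnndm/train.jsonl") as f:
    example = json.loads(f.readline())
print(list(example.keys()))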

Running finetuning experiment

We show here how to run the training, prediction, and evaluation steps for a finetuning experiment. We assume that you have downloaded the pretrained models into the pretrained_models folder from the provided Google Drive link (see pretrained_models/README.md). If you want to pretrain models yourself, see the latter part of this README for instructions.

All models in our work are trained using AllenNLP config files, which are in .jsonnet format. To run a finetuning experiment, simply run:

# for T5-like models
./pipeline_t5.sh <path_to_experiment_folder>

# for pointer-generator models
./pipeline_pg.sh <path_to_experiment_folder>

For example, to finetune a T5 model on the CNN/DailyMail dataset, starting from a model pretrained on the ourtasks-nonsense pretraining dataset, run:

./pipeline_t5.sh finetuning_experiments/cnndm/t5-ourtasks-nonsense

Similarly, to finetune a randomly initialized pointer-generator model, run:

./pipeline_pg.sh finetuning_experiments/cnndm/pg-randominit
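
The experiment folder passed to these scripts holds the .jsonnet config for that run. If you want to inspect a config programmatically, AllenNLP can parse it. A minimal sketch, assuming a config file named experiment.jsonnet inside the folder (the actual file name in this repository may differ):

from allennlp.common.params import Params

# Parse a .jsonnet experiment config into a nested, dict-like Params object.
# NOTE: the exact config file name is an assumption for illustration.
params = Params.from_file(
    "finetuning_experiments/cnndm/t5-ourtasks-nonsense/experiment.jsonnet"
)
print(params.as_dict(quiet=True))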

The trained model and output files will be available in a folder created by the script:

model.tar.gz contains the trained (finetuned) model.

test_outputs.jsonl contains the outputs of the model on the test split.

test_genmetrics.json contains the ROUGE scores of the outputs.
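
To sanity-check a finished run, you can read the metrics file back and, since the model is saved in AllenNLP's archive format, reload the finetuned model. A minimal sketch using the file names listed above; the keys inside test_genmetrics.json depend on the evaluation setup:

import json
from allennlp.models.archival import load_archive

# ROUGE scores written by the evaluation step.
with open("test_genmetrics.json") as f:
    print(json.load(f))

# Reload the finetuned model from the AllenNLP archive.
archive = load_archive("model.tar.gz")
model = archive.model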

Creating pretraining datasets

We have provided the nonsense pretraining datasets used in our work via Google Drive (see dataset_root/pretraining_datasets/README.md for instructions).

However, if you want to generate your own pretraining corpus, you can run:

cd pretraining_datasetgen
# for generating a dataset using our pretraining tasks (ourtasks)
python ourtasks.py
# for generating a dataset using the STEP pretraining tasks
python steptasks.py

These commands create pretraining datasets from nonsense text. If you want to create datasets starting from Wikipedia documents instead, look inside the two scripts: each contains two blocks of code to comment/uncomment, with guidance on how to do so.
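
For intuition, a nonsense document is just a sequence of symbols carrying no natural-language meaning. The toy snippet below is purely illustrative (it is not the procedure implemented in ourtasks.py or steptasks.py): it builds a random vocabulary and samples a document from it.

import random
import string

# Build a vocabulary of random "words", then sample a nonsense document.
# Purely illustrative; not the repository's actual generation procedure.
vocab = [
    "".join(random.choices(string.ascii_lowercase, k=random.randint(2, 8)))
    for _ in range(500)
]
document = " ".join(random.choices(vocab, k=120))
print(document)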

Pretraining models

Although we provide the pretrained model checkpoints via Google Drive, you can pretrain your own models using the corresponding pretraining config file. As an example, we have provided a config file that pretrains on the ourtasks-nonsense dataset. Make sure that the pretraining dataset files exist (either created by you or downloaded from Google Drive) before running the pretraining command. Pretraining uses the same shell scripts as the finetuning experiments. For example, to pretrain a model on the ourtasks-nonsense dataset, simply run:

./pipeline_t5.sh pretraining_experiments/pretraining_t5_ourtasks_nonsense