A New Approach to Overgenerating and Scoring Abstractive Summaries

Overview

A New Approach to Overgenerating and Scoring Abstractive Summaries

We provide the source code for the paper "A New Approach to Overgenerating and Scoring Abstractive Summaries" accepted at NAACL'21. If you find the code useful, please cite the following paper.

@inproceedings{song2021new, 
    title={A New Approach to Overgenerating and Scoring Abstractive Summaries},
    author={Song, Kaiqiang and Wang, Bingqing and Feng, Zhe and Liu, Fei},
    booktitle={Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
    pages={1392--1404},
    year={2021}
}

Presentation Video

Check Our Presentation Video

Demo

Source Input:

The Bank of Japan appealed to financial markets to remain calm Friday following the US decision to order Daiwa Bank Ltd. to close its US operations.

Summaries with varying lengths:

Dependencies

The code is written in Python (v3.7) and Pytorch (v1.7+). We suggest the following enviorment:

HINT: Since huggingface transformers is alternating very fast, you may need to modify a lot of stuff if you want to use a new version. Contact me if you get any trouble on it.

To install pyrouge and transformers, run the command below:

pip install pyrouge transformers==2.3.0

For generating summaries with varying length

Step 1: clone this repo. Download trained Our Model, move it to the working folder and uncompress it.

git clone https://github.com/ucfnlp/varying-length-summ.git
mv model.zip varying-length-summ
cd varying-length-summ
unzip models.zip

Step 2: Generating summaries with varying length from a raw input file.

python run.py --do_test --parallel --input data/input.txt

It will generate summaries of varying lengths coupled with its order information.

For Selecting summaries with best quality binary classifer

Step 1: Follow the previous section about generating summaries with multiple length.

Step 2: Collect test set similar to data/gigaword_cls/test500* files:

  1. a source input file test500_input.txt

  2. a target output file test500_output.txt

  3. a label file test500_label.txt for whether the target summary is admissible for the source input. (all 0 if you don't have thoese labels)

HINT: one instance per line

Step 3: modify the test500 settings in settings/dataset/gigaword_cls.

Step 4: Run the code below.

python run_classifier.py --do_test --parallel

It will generate a prediction of admissible probability in predict.txt.

For Selecting summaries with length reward reranking method

Step 1: Follow the previous section about generating summaries with multiple length.

Step 2: Run the code below.

python run_rerank.py

It will re-rank the summary with length rewards. The predicted length is in length.txt

For Data Downloading (500 inputs x 7 lengths)

Please refer to this link

Owner
Kaiqiang Song
A PhD student of [email protected]. Interest in Machine Learning, Deep Neural Networks, Natural Lang
Kaiqiang Song
[NeurIPS 2021] Code for Unsupervised Learning of Compositional Energy Concepts

Unsupervised Learning of Compositional Energy Concepts This is the pytorch code for the paper Unsupervised Learning of Compositional Energy Concepts.

45 Nov 30, 2022
Kaggle Lyft Motion Prediction for Autonomous Vehicles 4th place solution

Lyft Motion Prediction for Autonomous Vehicles Code for the 4th place solution of Lyft Motion Prediction for Autonomous Vehicles on Kaggle. Discussion

44 Jun 27, 2022
A general framework for deep learning experiments under PyTorch based on pytorch-lightning

torchx Torchx is a general framework for deep learning experiments under PyTorch based on pytorch-lightning. TODO list gan-like training wrapper text

Yingtian Liu 6 Mar 17, 2022
Recognize Handwritten Digits using Deep Learning on the browser itself.

MNIST on the Web An attempt to predict MNIST handwritten digits from my PyTorch model from the browser (client-side) and not from the server, with the

Harjyot Bagga 7 May 28, 2022
PyArmadillo: an alternative approach to linear algebra in Python

PyArmadillo is a linear algebra library for the Python language, with an emphasis on ease of use.

Terry Zhuo 58 Oct 11, 2022
A Shading-Guided Generative Implicit Model for Shape-Accurate 3D-Aware Image Synthesis

A Shading-Guided Generative Implicit Model for Shape-Accurate 3D-Aware Image Synthesis Figure: Shape-Accurate 3D-Aware Image Synthesis. A Shading-Guid

Xingang Pan 115 Dec 18, 2022
TensorFlow code for the neural network presented in the paper: "Structural Language Models of Code" (ICML'2020)

SLM: Structural Language Models of Code This is an official implementation of the model described in: "Structural Language Models of Code" [PDF] To ap

73 Nov 06, 2022
CVPR2022 (Oral) - Rethinking Semantic Segmentation: A Prototype View

Rethinking Semantic Segmentation: A Prototype View Rethinking Semantic Segmentation: A Prototype View, Tianfei Zhou, Wenguan Wang, Ender Konukoglu and

Tianfei Zhou 239 Dec 26, 2022
TransMVSNet: Global Context-aware Multi-view Stereo Network with Transformers.

TransMVSNet This repository contains the official implementation of the paper: "TransMVSNet: Global Context-aware Multi-view Stereo Network with Trans

旷视研究院 3D 组 155 Dec 29, 2022
BT-Unet: A-Self-supervised-learning-framework-for-biomedical-image-segmentation-using-Barlow-Twins

BT-Unet: A-Self-supervised-learning-framework-for-biomedical-image-segmentation-using-Barlow-Twins Deep learning has brought most profound contributio

Narinder Singh Punn 12 Dec 04, 2022
DGL-TreeSearch and the Gurobi-MWIS interface

Independent Set Benchmarking Suite This repository contains the code for our maximum independent set benchmarking suite as well as our implementations

Maximilian Böther 19 Nov 22, 2022
pytorch implementation of "Contrastive Multiview Coding", "Momentum Contrast for Unsupervised Visual Representation Learning", and "Unsupervised Feature Learning via Non-Parametric Instance-level Discrimination"

Unofficial implementation: MoCo: Momentum Contrast for Unsupervised Visual Representation Learning (Paper) InsDis: Unsupervised Feature Learning via N

Zhiqiang Shen 16 Nov 04, 2020
Pytorch implementation of Feature Pyramid Network (FPN) for Object Detection

fpn.pytorch Pytorch implementation of Feature Pyramid Network (FPN) for Object Detection Introduction This project inherits the property of our pytorc

Jianwei Yang 912 Dec 21, 2022
Official TensorFlow code for the forthcoming paper

~ Efficient-CapsNet ~ Are you tired of over inflated and overused convolutional neural networks? You're right! It's time for CAPSULES :)

Vittorio Mazzia 203 Jan 08, 2023
Config files for my GitHub profile.

Canalyst Candas Data Science Library Name Canalyst Candas Description Built by a former PM / analyst to give anyone with a little bit of Python knowle

Canalyst Candas 13 Jun 24, 2022
This repository provides code for "On Interaction Between Augmentations and Corruptions in Natural Corruption Robustness".

On Interaction Between Augmentations and Corruptions in Natural Corruption Robustness This repository provides the code for the paper On Interaction B

Meta Research 33 Dec 08, 2022
Contrastive Language-Image Pretraining

CLIP [Blog] [Paper] [Model Card] [Colab] CLIP (Contrastive Language-Image Pre-Training) is a neural network trained on a variety of (image, text) pair

OpenAI 11.5k Jan 08, 2023
Progressive Image Deraining Networks: A Better and Simpler Baseline

Progressive Image Deraining Networks: A Better and Simpler Baseline [arxiv] [pdf] [supp] Introduction This paper provides a better and simpler baselin

190 Dec 01, 2022
Hierarchical Attentive Recurrent Tracking

Hierarchical Attentive Recurrent Tracking This is an official Tensorflow implementation of single object tracking in videos by using hierarchical atte

Adam Kosiorek 147 Aug 07, 2021
Message Passing on Cell Complexes

CW Networks This repository contains the code used for the papers Weisfeiler and Lehman Go Cellular: CW Networks (Under review) and Weisfeiler and Leh

Twitter Research 108 Jan 05, 2023