A New Approach to Overgenerating and Scoring Abstractive Summaries

Overview

A New Approach to Overgenerating and Scoring Abstractive Summaries

We provide the source code for the paper "A New Approach to Overgenerating and Scoring Abstractive Summaries" accepted at NAACL'21. If you find the code useful, please cite the following paper.

@inproceedings{song2021new, 
    title={A New Approach to Overgenerating and Scoring Abstractive Summaries},
    author={Song, Kaiqiang and Wang, Bingqing and Feng, Zhe and Liu, Fei},
    booktitle={Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
    pages={1392--1404},
    year={2021}
}

Presentation Video

Check Our Presentation Video

Demo

Source Input:

The Bank of Japan appealed to financial markets to remain calm Friday following the US decision to order Daiwa Bank Ltd. to close its US operations.

Summaries with varying lengths:

Dependencies

The code is written in Python (v3.7) and Pytorch (v1.7+). We suggest the following enviorment:

HINT: Since huggingface transformers is alternating very fast, you may need to modify a lot of stuff if you want to use a new version. Contact me if you get any trouble on it.

To install pyrouge and transformers, run the command below:

pip install pyrouge transformers==2.3.0

For generating summaries with varying length

Step 1: clone this repo. Download trained Our Model, move it to the working folder and uncompress it.

git clone https://github.com/ucfnlp/varying-length-summ.git
mv model.zip varying-length-summ
cd varying-length-summ
unzip models.zip

Step 2: Generating summaries with varying length from a raw input file.

python run.py --do_test --parallel --input data/input.txt

It will generate summaries of varying lengths coupled with its order information.

For Selecting summaries with best quality binary classifer

Step 1: Follow the previous section about generating summaries with multiple length.

Step 2: Collect test set similar to data/gigaword_cls/test500* files:

  1. a source input file test500_input.txt

  2. a target output file test500_output.txt

  3. a label file test500_label.txt for whether the target summary is admissible for the source input. (all 0 if you don't have thoese labels)

HINT: one instance per line

Step 3: modify the test500 settings in settings/dataset/gigaword_cls.

Step 4: Run the code below.

python run_classifier.py --do_test --parallel

It will generate a prediction of admissible probability in predict.txt.

For Selecting summaries with length reward reranking method

Step 1: Follow the previous section about generating summaries with multiple length.

Step 2: Run the code below.

python run_rerank.py

It will re-rank the summary with length rewards. The predicted length is in length.txt

For Data Downloading (500 inputs x 7 lengths)

Please refer to this link

Owner
Kaiqiang Song
A PhD student of [email protected]. Interest in Machine Learning, Deep Neural Networks, Natural Lang
Kaiqiang Song
PyTorch implementation HoroPCA: Hyperbolic Dimensionality Reduction via Horospherical Projections

HoroPCA This code is the official PyTorch implementation of the ICML 2021 paper: HoroPCA: Hyperbolic Dimensionality Reduction via Horospherical Projec

HazyResearch 52 Nov 14, 2022
All the code and files related to the MI-Lab of UE19CS305 course in sem 5

Machine-Intelligence-Lab-CS305 The compilation of all the code an drelated files from MI-Lab UE19CS305 (of batch 2019-2023) offered by PES University

Arvind Krishna 3 Nov 10, 2022
Official implementation of "An Image is Worth 16x16 Words, What is a Video Worth?" (2021 paper)

An Image is Worth 16x16 Words, What is a Video Worth? paper Official PyTorch Implementation Gilad Sharir, Asaf Noy, Lihi Zelnik-Manor DAMO Academy, Al

213 Nov 12, 2022
Segmentation vgg16 fcn - cityscapes

VGGSegmentation Segmentation vgg16 fcn - cityscapes Priprema skupa skripta prepare_dataset_downsampled.py Iz slika cityscapesa izrezuje haubu automobi

6 Oct 24, 2020
Implementation of the federated dual coordinate descent (FedDCD) method.

FedDCD.jl Implementation of the federated dual coordinate descent (FedDCD) method. Installation To install, just call Pkg.add("https://github.com/Zhen

Zhenan Fan 6 Sep 21, 2022
IDM: An Intermediate Domain Module for Domain Adaptive Person Re-ID,

Intermediate Domain Module (IDM) This repository is the official implementation for IDM: An Intermediate Domain Module for Domain Adaptive Person Re-I

Yongxing Dai 87 Nov 22, 2022
RipsNet: a general architecture for fast and robust estimation of the persistent homology of point clouds

RipsNet: a general architecture for fast and robust estimation of the persistent homology of point clouds This repository contains the code asscoiated

Felix Hensel 14 Dec 12, 2022
This is an official implementation for "PlaneRecNet".

PlaneRecNet This is an official implementation for PlaneRecNet: A multi-task convolutional neural network provides instance segmentation for piece-wis

yaxu 50 Nov 17, 2022
DSL for matching Python ASTs

py-ast-rule-engine This library provides a DSL (domain-specific language) to match a pattern inside a Python AST (abstract syntax tree). The library i

1 Dec 18, 2021
Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

Segmentation Transformer Implementation of Segmentation Transformer in PyTorch, a new model to achieve SOTA in semantic segmentation while using trans

Abhay Gupta 161 Dec 08, 2022
Fairness Metrics: All you need to know

Fairness Metrics: All you need to know Testing machine learning software for ethical bias has become a pressing current concern. Recent research has p

Anonymous2020 1 Jan 17, 2022
Request execution of Galaxy SARS-CoV-2 variation analysis workflows on input data you provide.

SARS-CoV-2 processing requests Request execution of Galaxy SARS-CoV-2 variation analysis workflows on input data you provide. Prerequisites This autom

useGalaxy.eu 17 Aug 13, 2022
OCR Post Correction for Endangered Language Texts

📌 Coming soon: an update to the software including features from our paper on semi-supervised OCR post-correction, to be published in the Transaction

Shruti Rijhwani 96 Dec 31, 2022
Software associated to AAAI paper "Planning with Biological Neurons and Synapses"

jBrain Software associated with the AAAI 2022 paper Francesco D'Amore, Daniel Mitropolsky, Pierluigi Crescenzi, Emanuele Natale, Christos H. Papadimit

Pierluigi Crescenzi 1 Apr 10, 2022
AI Virtual Calculator: This is a simple virtual calculator based on Artificial intelligence.

AI Virtual Calculator: This is a simple virtual calculator that works with gestures using OpenCV. We will use our hand in the air to click on the calc

Md. Rakibul Islam 1 Jan 13, 2022
A distributed deep learning framework that supports flexible parallelization strategies.

FlexFlow FlexFlow is a deep learning framework that accelerates distributed DNN training by automatically searching for efficient parallelization stra

528 Dec 25, 2022
Tensorflow implementation and notebooks for Implicit Maximum Likelihood Estimation

tf-imle Tensorflow 2 and PyTorch implementation and Jupyter notebooks for Implicit Maximum Likelihood Estimation (I-MLE) proposed in the NeurIPS 2021

NEC Laboratories Europe 69 Dec 13, 2022
Functional deep learning

Pipeline abstractions for deep learning. Full documentation here: https://lf1-io.github.io/padl/ PADL: is a pipeline builder for PyTorch. may be used

LF1 101 Nov 09, 2022
Project page for our ICCV 2021 paper "The Way to my Heart is through Contrastive Learning"

The Way to my Heart is through Contrastive Learning: Remote Photoplethysmography from Unlabelled Video This is the official project page of our ICCV 2

36 Jan 06, 2023
State-of-the-art language models can match human performance on many tasks

Status: Archive (code is provided as-is, no updates expected) Grade School Math [Blog Post] [Paper] State-of-the-art language models can match human p

OpenAI 259 Jan 08, 2023