BARTScore: Evaluating Generated Text as Text Generation

Last update: Dec 17, 2022

Overview

This is the repository for the paper "BARTScore: Evaluating Generated Text as Text Generation".

Updates

2021.06.28 Release online evaluation Demo
2021.06.25 Release online Explainable Leaderboard for Meta-evaluation
2021.06.22 Code will be released soon

Background

Recent work leverages neural models for automated evaluation in several different ways, as shown in Fig. 1.

(a) Evaluation as matching task. Unsupervised matching metrics aim to measure the semantic equivalence between the reference and the hypothesis by using token-level matching functions in a distributed representation space (e.g. BERT) or a discrete string space (e.g. ROUGE).

(b) Evaluation as regression task. Regression-based metrics (e.g. BLEURT) introduce a parameterized regression layer, which is learned in a supervised fashion to predict human judgments accurately.

(c) Evaluation as ranking task. Ranking-based metrics (e.g. COMET) aim to learn a scoring function that assigns a higher score to better hypotheses than to worse ones.

(d) Evaluation as generation task. In this work, we formulate the evaluation of generated text as a text generation task carried out with pre-trained language models.
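
To make formulation (d) concrete, here is a minimal sketch, assuming the score of a hypothesis is a weighted sum of the log-probabilities that a conditional generation model assigns to its tokens given the source. The names `generation_score` and `toy_copy_model` are hypothetical and are not taken from this repository. The toy model uses made-up, unnormalized probabilities in place of a real pre-trained sequence-to-sequence model such as BART, so that the example runs without downloading anything.

```python
import math
from typing import Callable, Optional, Sequence

# Hypothetical interface: returns p(token | source, previously generated tokens).
TokenProb = Callable[[Sequence[str], Sequence[str], str], float]


def generation_score(
    source: Sequence[str],
    hypothesis: Sequence[str],
    token_prob: TokenProb,
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted sum of log p(y_t | y_<t, x) over the hypothesis tokens."""
    if weights is None:
        # Assumption: uniform weighting, i.e. every token counts equally.
        weights = [1.0] * len(hypothesis)
    if len(weights) != len(hypothesis):
        raise ValueError("weights and hypothesis must have the same length")

    total = 0.0
    for t, token in enumerate(hypothesis):
        # Probability of the t-th token given the source and the prefix y_<t.
        p = token_prob(source, hypothesis[:t], token)
        total += weights[t] * math.log(p)
    return total


def toy_copy_model(source: Sequence[str], prefix: Sequence[str], token: str) -> float:
    """Toy stand-in for a pre-trained model (NOT BART).

    It ignores the prefix and simply prefers tokens that occur in the source.
    The values are illustrative and do not form a normalized distribution.
    """
    return 0.5 if token in source else 0.1


source = ["the", "cat", "sat"]
faithful = ["the", "cat"]
unrelated = ["a", "dog"]

faithful_score = generation_score(source, faithful, toy_copy_model)
unrelated_score = generation_score(source, unrelated, toy_copy_model)
print(faithful_score > unrelated_score)
```

Scores are log-likelihoods, so they are negative and a higher (less negative) value means the model finds the hypothesis more likely given the source. With a real model, `token_prob` would be replaced by the decoder's per-token probabilities.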

Owner

NeuLab
