BARTScore: Evaluating Generated Text as Text Generation

Last update: Dec 17, 2022

Related tags

Deep Learning BARTScore

Overview

This is the repository for the paper "BARTScore: Evaluating Generated Text as Text Generation".

Updates

2021.06.28 Release online evaluation Demo
2021.06.25 Release online Explainable Leaderboard for Meta-evaluation
2021.06.22 Code will be released soon

Background

A recent trend leverages neural models for automated evaluation in different ways, as shown in Fig. 1.

(a) Evaluation as matching task. Unsupervised matching metrics aim to measure the semantic equivalence between the reference and the hypothesis by using token-level matching functions in distributed representation space (e.g. BERT) or discrete string space (e.g. ROUGE).

(b) Evaluation as regression task. Regression-based metrics (e.g. BLEURT) introduce a parameterized regression layer, which is learned in a supervised fashion to accurately predict human judgments.

(c) Evaluation as ranking task. Ranking-based metrics (e.g. COMET) aim to learn a scoring function that assigns a higher score to better hypotheses than to worse ones.

(d) Evaluation as generation task. In this work, we formulate the evaluation of generated text as a text generation task using pre-trained language models.
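
To make the generation-based view in (d) concrete, here is a minimal sketch that scores a hypothesis by the weighted sum of the log-probabilities of its tokens, conditioned on a source text. It is an illustration under stated assumptions, not the repository's implementation: the names `generation_score`, `TokenProb`, and `toy_token_prob` are hypothetical, the token weights default to uniform, and the toy probability function stands in for a real pre-trained sequence-to-sequence model such as BART.

```python
import math
from typing import Callable, Optional, Sequence

# Hypothetical interface: returns p(token | source, previously generated tokens).
TokenProb = Callable[[Sequence[str], Sequence[str], str], float]


def generation_score(
    source: Sequence[str],
    hypothesis: Sequence[str],
    token_prob: TokenProb,
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted sum of log p(y_t | y_<t, x) over the hypothesis tokens."""
    if weights is None:
        weights = [1.0] * len(hypothesis)  # assumption: uniform token weights
    score = 0.0
    for t, token in enumerate(hypothesis):
        # Probability of the t-th token given the source and the prefix so far.
        p = token_prob(source, hypothesis[:t], token)
        score += weights[t] * math.log(p)
    return score


def toy_token_prob(source, prefix, token):
    """Toy stand-in for a pre-trained model: favors tokens seen in the source."""
    return 0.5 if token in source else 0.1


if __name__ == "__main__":
    src = "the cat sat on the mat".split()
    # A hypothesis that reuses source tokens gets a higher (less negative) score.
    print(generation_score(src, "the cat sat".split(), toy_token_prob))
    print(generation_score(src, "a dog ran".split(), toy_token_prob))
```

Scores are log-likelihoods, so they are at most zero, and a higher value means the model finds the hypothesis more likely given the source. In a real setting, the toy probability function would be replaced by the token probabilities of a pre-trained model.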

Owner

NeuLab