DUE Evaluator

This repository contains the evaluator for every metric required by the tasks in the DUE Benchmark: set-based F1 (for KIE), ANLS (used in document VQA), accuracy (including the variant used in WTQ), and the group-based ANLS we proposed for KIE problems with structured output.
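
To make the ANLS metric concrete, here is a minimal sketch of how such a score is commonly computed. It is not this repository's implementation: the function names (`levenshtein`, `nls`, `anls`), the 0.5 threshold, and the choice to score each prediction against its best-matching reference are assumptions made for illustration.

```python
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings, using a two-row dynamic programme."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def nls(prediction: str, reference: str,
        threshold: float = 0.5, case_insensitive: bool = True) -> float:
    """Normalized Levenshtein similarity, zeroed when the strings are too far apart."""
    if case_insensitive:
        prediction, reference = prediction.lower(), reference.lower()
    longest = max(len(prediction), len(reference))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    distance = levenshtein(prediction, reference) / longest
    # Assumed threshold: answers at or beyond it receive no credit.
    return 1.0 - distance if distance < threshold else 0.0


def anls(predictions: Sequence[str], references: Sequence[Sequence[str]],
         case_insensitive: bool = True) -> float:
    """Average, over all items, of the best similarity to any accepted reference."""
    if not predictions:
        return 0.0
    scores = [
        max(nls(pred, ref, case_insensitive=case_insensitive) for ref in refs)
        for pred, refs in zip(predictions, references)
    ]
    return sum(scores) / len(scores)
```

For example, `anls(["Paris"], [["paris", "Paris, France"]])` gives full credit with case insensitivity enabled, while a prediction sharing no characters with any reference scores zero.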

Usage

The deval command becomes available once the package is installed. Every invocation must be given the input and output files (both in the DU-Schema format) through the -o and -r parameters.

The remaining settings are task-specific and limited to the metric (-m) and optional case insensitivity (-i). The recommended values are:

Dataset	Metric	Case insensitive
DocVQA, InfographicsVQA	ANLS	Yes
Kleister Charity, DeepForm	F1	Yes
PapersWithCode	GROUP-ANLS	Yes
WikiTableQuestions	WTQ	No (handled by metric itself)
TabFact	F1 (obtained value will be equal to Accuracy)	No
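
The recommended settings above can be captured in a small lookup that assembles a deval argument list. This is a hedged sketch, not part of the package: the `RECOMMENDED` mapping and the `build_deval_args` helper are hypothetical, the file paths are placeholders, and it assumes that -i is a boolean switch taking no value.

```python
from typing import Dict, List, Tuple

# (metric, case_insensitive) pairs copied from the table of recommended values.
RECOMMENDED: Dict[str, Tuple[str, bool]] = {
    "DocVQA": ("ANLS", True),
    "InfographicsVQA": ("ANLS", True),
    "Kleister Charity": ("F1", True),
    "DeepForm": ("F1", True),
    "PapersWithCode": ("GROUP-ANLS", True),
    "WikiTableQuestions": ("WTQ", False),  # case handled by the metric itself
    "TabFact": ("F1", False),              # F1 here equals accuracy
}


def build_deval_args(dataset: str, o_file: str, r_file: str) -> List[str]:
    """Return an argument list for deval using the recommended settings."""
    if dataset not in RECOMMENDED:
        raise KeyError(f"No recommended settings for dataset: {dataset!r}")
    metric, case_insensitive = RECOMMENDED[dataset]
    args = ["deval", "-o", o_file, "-r", r_file, "-m", metric]
    if case_insensitive:
        # Assumption: -i is a flag without a value.
        args.append("-i")
    return args
```

Once the package is installed, the resulting list could be passed to `subprocess.run(args, check=True)`; the two file arguments must point to DU-Schema files.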
