DUE: End-to-End Document Understanding Benchmark

Overview

This is the repository that provide tools to download data, reproduce the baseline results and evaluation.

What can you achieve with this guide

Based on this repository, you may be able to:

  1. download data for benchmark in a unified format.
  2. run all the baselines.
  3. evaluate already trained baseline models.

Install benchmark-related repositories

Start the container:

sudo userdocker run nvcr.io/nvidia/pytorch:20.12-py3

Clone the repo with:

git clone [email protected]:due-benchmark/baselines.git

Install the requirements:

pip install -e .

1. Download datasets and the base model

The datasets are re-hosted on the https://duebenchmark.com/data and can be downloaded from there. Moreover, since the baselines are finetuned based on the T5 model, you need to download the original model. Again it is re-hosted at https://duebenchmark.com/data. Please place it into the due_benchmark_data directory after downloading.

TODO: dopisać resztę

2. Run baseline trainings

2.1 Process datasets into memmaps (binarization)

In order to process datasets into memmaps, set the directory downloaded_data_path to downloaded data, set memmap_directory to a new directory that will store binarized datas, and use the following script:

./create_memmaps.sh

2.2 Run training script

Single training can be started with the following command, assuming out_dir is set as an output for the trained model's checkpoints and generated outputs. Additionally, set datas to any of the previously generated datasets (e.g., to DeepForm).

python benchmarker/cli/l5/train.py \
    --model_name_or_path ${downloaded_data_path}/t5-base \
    --relative_bias_args="[{\"type\":\"1d\"}]" \
    --dropout_rate 0.15 \
    --model_type=t5 \
    --output_dir ${out_dir} \
    --data_dir ${memmap_directory}/${datas}_memmap/train \
    --val_data_dir ${memmap_directory}/${datas}_memmap/dev \
    --test_data_dir ${memmap_directory}/${datas}_memmap/test \
    --gpus 1 \
    --max_epochs 30 \
    --train_batch_size 1 \
    --eval_batch_size 2 \
    --overwrite_output_dir \
    --accumulate_grad_batches 64 \
    --max_source_length 1024 \
    --max_target_length 256 \
    --eval_max_gen_length 16 \
    --learning_rate 2e-4 \
    --lr_scheduler constant \
    --warmup_steps 100 \
    --trim_batches \ 
    --do_train \
    --do_predict \ 
    --additional_data_fields doc_id label_name \
    --early_stopping_patience 20 \
    --segment_levels tokens pages \
    --optimizer adamw \
    --weight_decay 1e-5 \
    --adam_epsilon 1e-8 \
    --num_workers 4 \
    --val_check_interval 1

The models presented in the paper differs only in two places. The first is the choice of --relative_bias_args. T5 uses [{'type': '1d'}] whereas both +2D and +DALL-E use [{'type': '1d'}, {'type': 'horizontal'}, {'type': 'vertical'}]

Moreover +DALL-E had --context_embeddings set to [{'dimension': 1024, 'use_position_bias': False, 'embedding_type': 'discrete_vae', 'pretrained_path': '', 'image_width': 256, 'image_height': 256}]

3. Evaluate

3.1 Convert output to the submission file

In order to compare two files (generated by the model with the provided library and the gold-truth answers), one has to convert the generated output into a format that can be directly compared with documents.jsonl. Please use:

python to_submission_file.py ${downloaded_data_path} ${out_dir}

3.2 Evaluate reproduced models

Finally outputs can be evaluated using the provided evaluator. First, get back into main directory, where this README.md is placed and install it by cd due_evaluator-master && pip install -r requirement And run:

python due_evaluator --out-files baselines/test_generations.jsonl --reference ${downloaded_data_path}/DeepForm

3.3 Evaluate baseline outputs

We provide an examples of outputs generated by our baseline (DeepForm). They should be processed with:

python benchmarker-code/to_submission_file.py ${downloaded_data_path}/model_outputs_example ${downloaded_data_path}
python due_evaluator --out-files ./benchmarker/cli/l5/baselines/test_generations.txt.jsonl --reference ${downloaded_data_path}/DeepForm/test/document.jsonl

The expected output should be:

       Label       F1  Precision   Recall
  advertiser 0.512909   0.513793 0.512027
contract_num 0.778761   0.780142 0.777385
 flight_from 0.794376   0.795775 0.792982
   flight_to 0.804921   0.806338 0.803509
gross_amount 0.355476   0.356115 0.354839
         ALL 0.649771   0.650917 0.648630
Data, model training, and evaluation code for "PubTables-1M: Towards a universal dataset and metrics for training and evaluating table extraction models".

PubTables-1M This repository contains training and evaluation code for the paper "PubTables-1M: Towards a universal dataset and metrics for training a

Microsoft 365 Jan 04, 2023
Pytorch implementation of MaskGIT: Masked Generative Image Transformer

Pytorch implementation of MaskGIT: Masked Generative Image Transformer

Dominic Rampas 247 Dec 16, 2022
Joint Learning of 3D Shape Retrieval and Deformation, CVPR 2021

Joint Learning of 3D Shape Retrieval and Deformation Joint Learning of 3D Shape Retrieval and Deformation Mikaela Angelina Uy, Vladimir G. Kim, Minhyu

Mikaela Uy 38 Oct 18, 2022
Benchmark for the generalization of 3D machine learning models across different remeshing/samplings of a surface.

Discretization Robust Correspondence Benchmark One challenge of machine learning on 3D surfaces is that there are many different representations/sampl

Nicholas Sharp 10 Sep 30, 2022
bespoke tooling for offensive security's Windows Usermode Exploit Dev course (OSED)

osed-scripts bespoke tooling for offensive security's Windows Usermode Exploit Dev course (OSED) Table of Contents Standalone Scripts egghunter.py fin

epi 268 Jan 05, 2023
Official Implementation for "StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery" (ICCV 2021 Oral)

StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery (ICCV 2021 Oral) Run this model on Replicate Optimization: Global directions: Mapper: Check ou

3.3k Jan 05, 2023
Optimising chemical reactions using machine learning

Summit Summit is a set of tools for optimising chemical processes. We’ve started by targeting reactions. What is Summit? Currently, reaction optimisat

Sustainable Reaction Engineering Group 75 Dec 14, 2022
Using pretrained GROVER to extract the atomic fingerprints from molecule

Extracting atomic fingerprints from molecules using pretrained Graph Neural Network models (GROVER).

Xuan Vu Nguyen 1 Jan 28, 2022
SpinalNet: Deep Neural Network with Gradual Input

SpinalNet: Deep Neural Network with Gradual Input This repository contains scripts for training different variations of the SpinalNet and its counterp

H M Dipu Kabir 142 Dec 30, 2022
"Graph Neural Controlled Differential Equations for Traffic Forecasting", AAAI 2022

Graph Neural Controlled Differential Equations for Traffic Forecasting Setup Python environment for STG-NCDE Install python environment $ conda env cr

Jeongwhan Choi 55 Dec 28, 2022
This is an official implementation of CvT: Introducing Convolutions to Vision Transformers.

Introduction This is an official implementation of CvT: Introducing Convolutions to Vision Transformers. We present a new architecture, named Convolut

Microsoft 408 Dec 30, 2022
A face dataset generator with out-of-focus blur detection and dynamic interval adjustment.

A face dataset generator with out-of-focus blur detection and dynamic interval adjustment.

Yutian Liu 2 Jan 29, 2022
ElegantRL is featured with lightweight, efficient and stable, for researchers and practitioners.

Lightweight, efficient and stable implementations of deep reinforcement learning algorithms using PyTorch. 🔥

AI4Finance 2.5k Jan 08, 2023
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers

DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers Authors: Jaemin Cho, Abhay Zala, and Mohit Bansal (

Jaemin Cho 98 Dec 15, 2022
Pytorch implementation of paper: "NeurMiPs: Neural Mixture of Planar Experts for View Synthesis"

NeurMips: Neural Mixture of Planar Experts for View Synthesis This is the official repo for PyTorch implementation of paper "NeurMips: Neural Mixture

James Lin 101 Dec 13, 2022
DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort

DatasetGAN This is the official code and data release for: DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort Yuxuan Zhang*, Huan Li

302 Jan 05, 2023
On the Adversarial Robustness of Visual Transformer

On the Adversarial Robustness of Visual Transformer Code for our paper "On the Adversarial Robustness of Visual Transformers"

Rulin Shao 35 Dec 14, 2022
Face recognition with trained classifiers for detecting objects using OpenCV

Face_Detector Face recognition with trained classifiers for detecting objects using OpenCV Libraries required to be installed using pip Command: cv2 n

Chumui Tripura 0 Oct 31, 2021
《Image2Reverb: Cross-Modal Reverb Impulse Response Synthesis》(2021)

Image2Reverb Image2Reverb is an end-to-end neural network that generates plausible audio impulse responses from single images of acoustic environments

Nikhil Singh 48 Nov 27, 2022
PyTorch Personal Trainer: My framework for deep learning experiments

Alex's PyTorch Personal Trainer (ptpt) (name subject to change) This repository contains my personal lightweight framework for deep learning projects

Alex McKinney 8 Jul 14, 2022