PyTorch implementation of Value Retrieval with Arbitrary Queries for Form-like Documents.

Last update: Sep 15, 2022

Related tags

Deep Learning QVR-SimpleDLM

Overview

Value Retrieval with Arbitrary Queries for Form-like Documents

Introduction

PyTorch implementation of Value Retrieval with Arbitrary Queries for Form-like Documents.

Environment

CUDA="11.0"
CUDNN="8"
UBUNTU="18.04"

Install

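# Run the project's install script.
bash install.sh
# Build and install NVIDIA apex with its C++ and CUDA extensions.
git clone https://github.com/NVIDIA/apex && cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
pip install .
# Back under our project root folder, install this package.
pip install .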

Data Preparation

Our model is pre-trained on the IIT-CDIP dataset, fine-tuned on the FUNSD training set, and evaluated on the FUNSD and INV-CDIP test sets.

Download our processed OCR results of IIT-CDIP with hocr_list_addr.txt and put them under PRETRAIN_DATA_FOLDER/.
Download our processed FUNSD and INV-CDIP datasets and put them under DATA_DIR/.
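Before running the scripts, it can help to confirm that the datasets landed where the commands expect them. The following is a minimal sketch, assuming the layout implied by the commands in this README ($DATA_DIR/FUNSD and $DATA_DIR/INV-CDIP). The helper name missing_dataset_dirs is hypothetical and not part of this codebase, and the files inside each folder are not checked.

```python
import os
from pathlib import Path

# Folder names taken from the commands in this README; the contents of
# each folder are not verified here.
EXPECTED_DATASETS = ("FUNSD", "INV-CDIP")


def missing_dataset_dirs(data_dir, names=EXPECTED_DATASETS):
    """Return the expected dataset sub-folders that are absent under data_dir."""
    root = Path(data_dir)
    return [name for name in names if not (root / name).is_dir()]


# Read DATA_DIR from the environment, falling back to the current folder.
data_dir = os.environ.get("DATA_DIR", ".")
for name in missing_dataset_dirs(data_dir):
    print(f"Missing dataset folder: {Path(data_dir) / name}")
```

An empty result means both dataset folders exist under DATA_DIR.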

Reproduce Our Results

Download our model fine-tuned on FUNSD here.
Run inference as follows:

# $MODEL_PATH here is where you save the fine-tuned model.
# DATASET_NAME is FUNSD or INV-CDIP.
bash reproduce_results.sh $MODEL_PATH $DATA_DIR/DATASET_NAME

You should get the following results.

Datasets	Precision	Recall	F1
FUNSD	60.4	60.9	60.7
INV-CDIP	50.5	47.6	49.0
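As a sanity check on the table, F1 can be recomputed as the harmonic mean of precision and recall. This is a minimal sketch; whether the evaluation script computes F1 from the rounded precision and recall in exactly this way is an assumption. Because the table values are rounded to one decimal place, the recomputed F1 may differ from the reported one in the last digit.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall, on the same scale as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above.
reported = {
    "FUNSD": (60.4, 60.9, 60.7),
    "INV-CDIP": (50.5, 47.6, 49.0),
}

for name, (p, r, f1) in reported.items():
    print(f"{name}: recomputed F1 = {f1_score(p, r):.2f}, reported F1 = {f1}")
```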

Pre-training

You can skip the following steps by downloading our pre-trained SimpleDLM model here.
Otherwise, download layoutlm-base-uncased as the starting checkpoint.
Then run pre-training as follows:

# $NUM_GPUS is the number of GPUs to pre-train on. To reproduce the paper's results, we recommend using 8 GPUs.
# $MODEL_PATH here is where you save the LayoutLM model.
# $PRETRAIN_DATA_FOLDER is the folder of IIT-CDIP hocr files.

python -m torch.distributed.launch --nproc_per_node=$NUM_GPUS pretraining.py \
--model_name_or_path $MODEL_PATH  --data_dir $PRETRAIN_DATA_FOLDER \
--output_dir $OUTPUT_DIR
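If you launch pre-training from a Python driver script (for example, to queue several runs), the same command can be assembled as an argument list. This is a minimal sketch that uses only the flags shown above. The helper name build_pretraining_command and the example paths are hypothetical placeholders.

```python
import shlex


def build_pretraining_command(num_gpus, model_path, data_folder, output_dir):
    """Assemble the pre-training launch command as a list of arguments."""
    return [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",
        "pretraining.py",
        "--model_name_or_path", str(model_path),
        "--data_dir", str(data_folder),
        "--output_dir", str(output_dir),
    ]


# Placeholder paths; substitute your own $MODEL_PATH, $PRETRAIN_DATA_FOLDER
# and $OUTPUT_DIR.
cmd = build_pretraining_command(8, "layoutlm-base-uncased", "pretrain_data", "output")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the project root folder.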

Fine-tuning

Run fine-tuning as follows:

# $MODEL_PATH is where you save the pre-trained SimpleDLM model.

CUDA_VISIBLE_DEVICES=0 python run_query_value_retrieval.py --model_type simpledlm --model_name_or_path $MODEL_PATH \
--data_dir $DATA_DIR/FUNSD/ --output_dir $OUTPUT_DIR --do_train --evaluate_during_training

Citation

If you find this codebase useful, please cite our paper:

@article{gao2021value,
  title={Value Retrieval with Arbitrary Queries for Form-like Documents},
  author={Gao, Mingfei and Xue, Le and Ramaiah, Chetan and Xing, Chen and Xu, Ran and Xiong, Caiming},
  journal={arXiv preprint arXiv:2112.07820},
  year={2021}
}

Contact

Please send an email to [email protected] or [email protected] if you have questions.


Owner

Salesforce
