Key information extraction from invoice document with Graph Convolution Network

Last update: Dec 16, 2022

Overview

Key Information Extraction from Scanned Invoices

Related blog post from my Viblo account: https://viblo.asia/p/djeZ1yPGZWz

Models

Background subtraction: U2Net
Image alignment: based on the output of the text-detection step, applied with cv2
Text detection: CRAFT and an in-house text-detection model
Text recognition: VietOCR and an in-house text-recognition model
KIE: Graph Convolution
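
To make the KIE stage above more concrete, here is a minimal sketch of a single graph-convolution layer over text boxes. It is not the model used in this repository: the node features, the edge list, and the weight matrix are made-up values, and the normalization follows the commonly used form H' = ReLU(D^-1/2 (A + I) D^-1/2 H W). In a real pipeline each node would be one detected text box and the edges would connect neighbouring boxes.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalized_adjacency(num_nodes, edges):
    """Build D^-1/2 (A + I) D^-1/2 for an undirected graph."""
    # Start from the identity so every node keeps its own features (self-loops).
    adj = [[1.0 if i == j else 0.0 for j in range(num_nodes)] for i in range(num_nodes)]
    for i, j in edges:
        adj[i][j] = 1.0
        adj[j][i] = 1.0
    degree = [sum(row) for row in adj]
    return [
        [adj[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(num_nodes)]
        for i in range(num_nodes)
    ]


def gcn_layer(features, edges, weights):
    """One graph-convolution layer: aggregate neighbours, project, apply ReLU."""
    a_hat = normalized_adjacency(len(features), edges)
    hidden = matmul(matmul(a_hat, features), weights)
    return [[max(0.0, value) for value in row] for row in hidden]


# Hypothetical example: three text boxes with two features each,
# where box 0 neighbours box 1 and box 1 neighbours box 2.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = [(0, 1), (1, 2)]
weights = [[1.0, -1.0], [0.5, 0.5]]  # assumed values, not trained weights

output = gcn_layer(features, edges, weights)
```

Each row of `output` is the updated representation of one text box. A classifier over these rows would then assign a field label to each box.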

Currently, I don't have an invoice-direction classifier model, but you can develop one to correct images that are rotated sideways or upside down.
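
As a rough illustration of the correction step, the sketch below assumes a hypothetical direction classifier that reports how far an image has been rotated clockwise (0, 90, 180 or 270 degrees) and undoes that rotation. The image is represented as a plain 2D list to keep the example self-contained. On real image arrays, `cv2.rotate` with `cv2.ROTATE_90_CLOCKWISE`, `cv2.ROTATE_180` or `cv2.ROTATE_90_COUNTERCLOCKWISE` performs the same operation.

```python
def rotate_clockwise(grid):
    """Rotate a 2D list by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def correct_orientation(grid, detected_angle):
    """Undo a clockwise rotation of detected_angle degrees.

    detected_angle is the (assumed) output of a direction classifier.
    """
    if detected_angle not in (0, 90, 180, 270):
        raise ValueError("detected_angle must be 0, 90, 180 or 270")
    # Undoing k quarter turns clockwise equals applying (4 - k) more quarter turns.
    turns = (4 - detected_angle // 90) % 4
    for _ in range(turns):
        grid = rotate_clockwise(grid)
    return grid


# Hypothetical 2x3 "image" that was scanned upside down.
upright = [[1, 2, 3], [4, 5, 6]]
upside_down = rotate_clockwise(rotate_clockwise(upright))
restored = correct_orientation(upside_down, 180)
```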

Pretrained model

Google Drive

Data

MC-OCR, a Vietnamese receipts dataset: https://aihub.vn/competitions/1
Preprocessed data: Google Drive

Pipeline

TODO

Command

Create a virtual environment using conda or virtualenv:

# with virtualenv
virtualenv -p python3 invoice_env
# activate environment
source invoice_env/bin/activate
# install prerequisite libraries
pip install -r requirements.txt

# 1st command, run API
make serve
# 2nd command, run web-gui with streamlit
make runapp

Then access the local server in your browser at http://0.0.0.0:7778

Preview

TODO

Add a data preprocessing script

Reference

MC-OCR dataset: https://aihub.vn/competitions/1
U2Net: https://github.com/xuebinqin/U-2-Net
CRAFT: https://github.com/clovaai/CRAFT-pytorch
VietOCR: https://github.com/pbcquoc/vietocr
Benchmarking GNNs: https://github.com/graphdeeplearning/benchmarking-gnns
PaddleOCR: https://github.com/PaddlePaddle/PaddleOCR

Owner

Phan Hoang
