This is the implementation of the paper "Gated Recurrent Convolution Neural Network for OCR"

Last update: Dec 22, 2022

Gated Recurrent Convolution Neural Network for OCR

This project is an implementation of the GRCNN for OCR. For details, please refer to the paper: https://papers.nips.cc/paper/6637-gated-recurrent-convolution-neural-network-for-ocr.pdf

Update

The journal version of GRCNN has been accepted by T-PAMI 2021, and the code is available at:

https://github.com/Jianf-Wang/GRCNN

Build

The GRCNN is built upon the CRNN. The requirements are:

Ubuntu 14.04
CUDA 7.5
CUDNN 5

To simplify compilation, the dependencies can be downloaded from: https://pan.baidu.com/s/1c21zl1e#list/path=%2F

Alternatively, it is more convenient to use the nvidia-docker image (supplied by @rremani): https://hub.docker.com/r/rremani/cuda_crnn_torch/

After installing the dependencies, go to src/ and execute build_cpp.sh to build the C++ code. If successful, a file named libcrnn.so should be produced in the src/ directory.

Inference

We provide the pretrained model from here. Put the downloaded model file into the model/GRCL/ directory. We also provide the IC03 dataset in the "./data/IC03" directory. Before running the evaluation, change the image directories listed in "test.txt" so that they match the location of the data on your machine. "test_label.txt" contains the ground truth for each image, and "lexicon_50.txt" is the lexicon for IC03.
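Changing the directories in "test.txt" by hand is error-prone, so a small script can help. The following is a minimal sketch, assuming that "test.txt" lists one image path per line (the file format is not documented here, so check it first). The function name rewrite_image_dirs and the example prefixes are hypothetical and not part of this repository.

```python
from pathlib import Path


def rewrite_image_dirs(list_file, old_prefix, new_prefix):
    """Replace old_prefix with new_prefix at the start of each listed path.

    Assumes one image path per line. Lines that do not start with
    old_prefix are left unchanged.
    """
    path = Path(list_file)
    lines = path.read_text().splitlines()
    updated = [
        new_prefix + line[len(old_prefix):] if line.startswith(old_prefix) else line
        for line in lines
    ]
    # Write the list back in place, one path per line.
    path.write_text("\n".join(updated) + "\n")
    return updated


# Example call (hypothetical prefixes):
# rewrite_image_dirs("data/IC03/test.txt", "/old/root", "/home/me/GRCNN")
```

Keep a backup copy of "test.txt" before running a script like this, since it overwrites the file.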

"src/evaluation.lua": Lexicon-free evaluation

"src/evaluation_lex.lua": Lexicon-based evaluation

The evaluation code will output the recognition accuracy.
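To illustrate the difference between the two evaluation modes, here is a minimal Python sketch of how recognition accuracy is commonly computed. It is not the code in the Lua scripts above. It assumes case-insensitive word-level matching, and for the lexicon-based case it snaps each prediction to the lexicon word with the smallest edit distance. For simplicity it uses one shared lexicon for all images, which may differ from how "lexicon_50.txt" is organized. All function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def snap_to_lexicon(prediction, lexicon):
    """Return the lexicon word closest to the prediction in edit distance."""
    return min(lexicon, key=lambda w: edit_distance(prediction.lower(), w.lower()))


def accuracy(predictions, labels, lexicon=None):
    """Fraction of predictions that match their label, ignoring case.

    If a lexicon is given, each prediction is first replaced by its
    nearest lexicon word (lexicon-based evaluation).
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    correct = 0
    for pred, label in zip(predictions, labels):
        if lexicon:
            pred = snap_to_lexicon(pred, lexicon)
        correct += pred.lower() == label.lower()
    return correct / len(labels)
```

In this sketch a prediction such as "bamk" counts as wrong in the lexicon-free setting, but is corrected to "bank" when the lexicon contains that word, which is why lexicon-based accuracy is typically at least as high.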

Train a new model

Follow these steps to train a new model on your own dataset.

Create a new LMDB dataset with src/create_own_dataset.py (run pip install lmdb first).
Optionally, modify the configuration in model/GRCL/GRCL_LSTM_pretrain.lua.
Go to src/ and execute th main_train.lua ../model/GRCL/ ../model/saved_model. Model snapshots will be saved into ../model/saved_model.
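For orientation, the dataset-creation step can be sketched as follows. This is not the content of src/create_own_dataset.py. It assumes the key layout used by the CRNN dataset tool, on which this project is built (image-%09d, label-%09d and num-samples), so check the actual script before relying on it. The function names build_cache and write_lmdb and the map_size default are hypothetical.

```python
def build_cache(samples):
    """Turn (image_bytes, label) pairs into LMDB key/value byte pairs.

    Assumed key layout (CRNN convention): 1-based, zero-padded indices
    plus a "num-samples" entry holding the dataset size.
    """
    samples = list(samples)
    cache = {}
    for index, (image_bytes, label) in enumerate(samples, start=1):
        cache[b"image-%09d" % index] = image_bytes
        cache[b"label-%09d" % index] = label.encode("utf-8")
    cache[b"num-samples"] = str(len(samples)).encode("ascii")
    return cache


def write_lmdb(output_dir, cache, map_size=1 << 30):
    """Write the cache to an LMDB environment at output_dir."""
    import lmdb  # third-party package: pip install lmdb

    env = lmdb.open(output_dir, map_size=map_size)
    with env.begin(write=True) as txn:
        for key, value in cache.items():
            txn.put(key, value)
    env.close()


# Example call (hypothetical file names):
# with open("word_1.jpg", "rb") as f:
#     samples = [(f.read(), "hello")]
# write_lmdb("../data/my_lmdb", build_cache(samples))
```

Images are stored as their raw encoded file bytes in this sketch. Increase map_size if the dataset is larger than the 1 GiB assumed here.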

Visualization

We visualize the RCNN, DenseNet and GRCNN to verify the dynamic receptive fields in GRCNN for OCR. In the GRCNN visualization there are clear gaps between different characters, and for each character the unrelated parts do not provide a strong signal.

Citation

@inproceedings{jianfeng2017deep,
 author    = {Wang, Jianfeng and Hu, Xiaolin},
 title     = {Gated Recurrent Convolution Neural Network for OCR},
 booktitle = {Advances in Neural Information Processing Systems},
 year      = {2017}
}
