Convolutional Recurrent Neural Network (CRNN) for image-based sequence recognition.

Overview

Convolutional Recurrent Neural Network

This software implements the Convolutional Recurrent Neural Network (CRNN), a combination of CNN, RNN and CTC loss for image-based sequence recognition tasks, such as scene text recognition and OCR. For details, please refer to our paper http://arxiv.org/abs/1507.05717.

UPDATE Mar 14, 2017 A Docker file has been added to the project. Thanks to @varun-suresh.

UPDATE May 1, 2017 A PyTorch port has been made by @meijieru.

UPDATE Jun 19, 2017 For an end-to-end text detector+recognizer, check out the CTPN+CRNN implementation by @AKSHAYUBHAT.

Build

The software has only been tested on Ubuntu 14.04 (x64). CUDA-enabled GPUs are required. To build the project, first install the latest versions of Torch7, fblualib and LMDB. Please follow their installation instructions respectively. On Ubuntu, lmdb can be installed by apt-get install liblmdb-dev.

To build the project, go to src/ and execute sh build_cpp.sh to build the C++ code. If successful, a file named libcrnn.so should be produced in the src/ directory.

Run demo

A demo program can be found in src/demo.lua. Before running the demo, download a pretrained model from here. Put the downloaded model file crnn_demo_model.t7 into directory model/crnn_demo/. Then launch the demo by:

th demo.lua

The demo reads an example image and recognizes its text content.

Example image: Example Image

Expected output:

Loading model...
Model loaded from ../model/crnn_demo/model.t7
Recognized text: available (raw: a-----v--a-i-l-a-bb-l-e---)

Another example: Example Image2

Recognized text: shakeshack (raw: ss-h-a--k-e-ssh--aa-c--k--)

Use pretrained model

The pretrained model can be used for lexicon-free and lexicon-based recognition tasks. Refer to the functions recognizeImageLexiconFree and recognizeImageWithLexicion in file utilities.lua for details.

Train a new model

Follow the following steps to train a new model on your own dataset.

  1. Create a new LMDB dataset. A python program is provided in tool/create_dataset.py. Refer to the function createDataset for details (need to pip install lmdb first).
  2. Create model directory under model/. For example, model/foo_model. Then create configuraton file config.lua under the model directory. You can copy model/crnn_demo/config.lua and do modifications.
  3. Go to src/ and execute th main_train.lua ../models/foo_model/. Model snapshots and logging file will be saved into the model directory.

Build using docker

  1. Install docker. Follow the instructions here
  2. Install nvidia-docker - Follow the instructions here
  3. Clone this repo, from this directory run docker build -t crnn_docker .
  4. Once the image is built, the docker can be run using nvidia-docker run -it crnn_docker.

Citation

Please cite the following paper if you are using the code/model in your research paper.

@article{ShiBY17,
  author    = {Baoguang Shi and
               Xiang Bai and
               Cong Yao},
  title     = {An End-to-End Trainable Neural Network for Image-Based Sequence Recognition
               and Its Application to Scene Text Recognition},
  journal   = {{IEEE} Trans. Pattern Anal. Mach. Intell.},
  volume    = {39},
  number    = {11},
  pages     = {2298--2304},
  year      = {2017}
}

Acknowledgements

The authors would like to thank the developers of Torch7, TH++, lmdb-lua-ffi and char-rnn.

Please let me know if you encounter any issues.

Owner
Baoguang Shi
Researcher at Microsoft
Baoguang Shi
virtual mouse which can copy files, close tabs and many other features !

AI Virtual Mouse Controller Developed an AI-based system to control the mouse cursor using Python and OpenCV with the real-time camera. Fingertip loca

Diwas Pandey 23 Oct 05, 2021
Opencv-image-filters - A camera to capture videos in real time by placing filters using Python with the help of the Tkinter and OpenCV libraries

Opencv-image-filters - A camera to capture videos in real time by placing filters using Python with the help of the Tkinter and OpenCV libraries

Sergio Díaz Fernández 1 Jan 13, 2022
Turn images of tables into CSV data. Detect tables from images and run OCR on the cells.

Table of Contents Overview Requirements Demo Modules Overview This python package contains modules to help with finding and extracting tabular data fr

Eric Ihli 311 Dec 24, 2022
Table Extraction Tool

Tree Structure - Table Extraction Fonduer has been successfully extended to perform information extraction from richly formatted data such as tables.

HazyResearch 88 Jun 02, 2022
BNF Globalization Code (CVPR 2016)

Boundary Neural Fields Globalization This is the code for Boundary Neural Fields globalization method. The technical report of the method can be found

25 Apr 15, 2022
A tool for extracting text from scanned documents (via OCR), with user-defined post-processing.

The project is based on older versions of tesseract and other tools, and is now superseded by another project which allows for more granular control o

Maxim 32 Jul 24, 2022
📷 Face Recognition using Haar-Cascade Classifier, OpenCV, and Python

Face-Recognition-System Face Recognition using Haar-Cascade Classifier, OpenCV and Python. This project is based on face detection and face recognitio

1 Jan 10, 2022
OCR, Scene-Text-Understanding, Text Recognition

Scene-Text-Understanding Survey [2015-PAMI] Text Detection and Recognition in Imagery: A Survey paper [2014-Front.Comput.Sci] Scene Text Detection and

Alan Tang 354 Dec 12, 2022
M-LSDを用いて四角形を検出し、射影変換を行うサンプルプログラム

M-LSD-warpPerspective-Example M-LSDを用いて四角形を検出し、射影変換を行うサンプルプログラムです。 Requirements OpenCV 3.4.2 or Later tensorflow 2.4.1 or Later Usage 実行方法は以下です。 pytho

KazuhitoTakahashi 9 Oct 14, 2022
Thresholding-and-masking-using-OpenCV - Image Thresholding is used for image segmentation

Image Thresholding is used for image segmentation. From a grayscale image, thresholding can be used to create binary images. In thresholding we pick a threshold T.

Grace Ugochi Nneji 3 Feb 15, 2022
🔎 Like Chardet. 🚀 Package for encoding & language detection. Charset detection.

Charset Detection, for Everyone 👋 The Real First Universal Charset Detector A library that helps you read text from an unknown charset encoding. Moti

TAHRI Ahmed R. 332 Dec 31, 2022
Fusion 360 Add-in that creates a pair of toothed curves that can be used to split a body and create two pieces that slide and lock together.

Fusion-360-Add-In-PuzzleSpline Fusion 360 Add-in that creates a pair of toothed curves that can be used to split a body and create two pieces that sli

Michiel van Wessem 1 Nov 15, 2021
document image degradation

ocrodeg The ocrodeg package is a small Python library implementing document image degradation for data augmentation for handwriting recognition and OC

NVIDIA Research Projects 134 Nov 18, 2022
Hand gesture detection project with aweome UI implementation.

an awesome hand gesture detection project for you to be creative! Imagination is the limit to do with this project.

AR Ashraf 39 Sep 26, 2022
An official PyTorch implementation of the paper "Learning by Aligning: Visible-Infrared Person Re-identification using Cross-Modal Correspondences", ICCV 2021.

PyTorch implementation of Learning by Aligning (ICCV 2021) This is an official PyTorch implementation of the paper "Learning by Aligning: Visible-Infr

CV Lab @ Yonsei University 30 Nov 05, 2022
Character Segmentation using TensorFlow

Character Segmentation Segment characters and spaces in one text line,from this paper Chinese English mixed Character Segmentation as Semantic Segment

26 Aug 25, 2022
An advanced 2D image manipulation with features such as edge detection and image segmentation built using OpenCV

OpenCV-ToothPaint3-Advanced-Digital-Image-Editor This application named ‘Tooth Paint’ version TP_2020.3 (64-bit) or version 3 was developed within a w

JunHong 1 Nov 05, 2021
This repo contains a script that allows us to find range of colors in images using openCV, and then convert them into geo vectors.

Vectorizing color range This repo contains a script that allows us to find range of colors in images using openCV, and then convert them into geo vect

Development Seed 9 Jul 27, 2022
text detection mainly based on ctpn model in tensorflow, id card detect, connectionist text proposal network

text-detection-ctpn Scene text detection based on ctpn (connectionist text proposal network). It is implemented in tensorflow. The origin paper can be

Shaohui Ruan 3.3k Dec 30, 2022
Repositório para registro de estudo da biblioteca opencv (Python)

OpenCV (Python) Objetivo do Repositório: Registrar avanços no estudo da biblioteca opencv. O repositório estará aberto a qualquer pessoa e há tambem u

1 Jun 14, 2022