Uses a Convolutional Recurrent Neural Network (CRNN) to recognize handwritten line text images without pre-segmentation into words or characters. The network is trained with the CTC loss function.

Last update: Jan 07, 2023

Overview

Handwritten Line Text Recognition using Deep Learning with Tensorflow

Description

This project uses a Convolutional Recurrent Neural Network (CRNN) to recognize handwritten line text images without pre-segmentation into words or characters. The network is trained with the CTC loss function. For more details, read this Medium post.

Why Deep Learning?

Deep learning extracts features on its own with a deep neural network and performs the classification itself. Compared to traditional algorithms, its performance increases with the amount of data.

Basic Intuition on How it Works.

First, the convolutional layers of the Convolutional Recurrent Neural Network extract the important features from the handwritten line text image.

The output of the CNN before the FC layer (512x100x8) is passed to the BLSTM, which models sequence dependencies and handles the time-sequence operations.

Then the CTC loss (Alex Graves) is used to train the RNN. It eliminates the alignment problem in handwriting, since every writer aligns characters differently. We only provide what is written in the image (the ground truth text) and the BLSTM output. The loss is then simply -log(probability of "gtText"), so training aims to minimize the negative log-likelihood of the ground truth text.

CTC considers all possible alignment paths that produce the given labels. For an (X, Y) pair, the loss is the negative log of the sum of the probabilities of all those paths.
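To make the "sum over all paths" idea concrete, here is a minimal brute-force sketch in plain Python. It is not the project's training code (which relies on TensorFlow's CTC loss) and only works for tiny inputs. The blank symbol `-`, the toy alphabet, and the probability values are assumptions made up for illustration.

```python
import itertools
import math

BLANK = "-"  # hypothetical blank symbol, used only in this sketch


def collapse(path):
    """Map a frame-wise path to a label: merge repeats, then drop blanks."""
    merged = [ch for i, ch in enumerate(path) if i == 0 or ch != path[i - 1]]
    return "".join(ch for ch in merged if ch != BLANK)


def ctc_loss_brute_force(probs, alphabet, target):
    """Return -log p(target | probs) by enumerating every possible path.

    probs[t][k] is the probability of alphabet[k] at time step t.
    """
    total = 0.0
    for path in itertools.product(range(len(alphabet)), repeat=len(probs)):
        if collapse([alphabet[k] for k in path]) != target:
            continue
        # Probability of one path is the product of its per-frame probabilities.
        path_prob = 1.0
        for t, k in enumerate(path):
            path_prob *= probs[t][k]
        total += path_prob
    return -math.log(total)


# Toy example: 3 time steps over the alphabet (a, b, blank).
alphabet = ["a", "b", BLANK]
probs = [
    [0.6, 0.1, 0.3],
    [0.2, 0.5, 0.3],
    [0.1, 0.7, 0.2],
]
loss = ctc_loss_brute_force(probs, alphabet, "ab")
print(loss)
```

Five paths collapse to "ab" here (aab, abb, a-b, -ab, ab-). Real implementations avoid the exponential enumeration with a dynamic-programming forward pass.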

Finally CTC Decode is used to decode the output during Prediction.
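As a rough illustration of decoding, the simplest strategy is best-path (greedy) decoding: take the most likely symbol at every time step, merge repeated symbols, and drop blanks. The sketch below is plain Python with a made-up alphabet and made-up probabilities. TensorFlow ships its own CTC decoders, and this README does not say which one the project calls, so treat this as an explanation of the idea rather than the project's decoder.

```python
def ctc_greedy_decode(probs, alphabet, blank_index):
    """Best-path CTC decoding.

    probs[t][k] is the probability of alphabet[k] at time step t.
    """
    # Step 1: pick the most likely symbol index in every frame.
    best_path = [max(range(len(frame)), key=frame.__getitem__) for frame in probs]

    # Step 2: merge consecutive repeats, then remove blanks.
    decoded = []
    previous = None
    for k in best_path:
        if k != previous and k != blank_index:
            decoded.append(alphabet[k])
        previous = k
    return "".join(decoded)


# Toy example: the frame-wise best path is "a a - a b b -".
alphabet = ["a", "b", "-"]  # "-" is a hypothetical blank symbol
probs = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.2, 0.7],
]
text = ctc_greedy_decode(probs, alphabet, blank_index=2)
print(text)
```

The blank between the two runs of "a" is what lets a genuinely doubled letter survive the merge step.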

Detail Project Workflow

The project consists of three steps:
1. Multi-scale feature Extraction --> Convolutional Neural Network 7 Layers
2. Sequence Labeling (BLSTM-CTC) --> Recurrent Neural Network (2 layers of LSTM) with CTC
3. Transcription --> Decoding the output of the RNN (CTC decode)

Requirements

TensorFlow 1.8.0
Flask
NumPy
OpenCV 3
Spell checker: autocorrect >=0.3.0 (pip install autocorrect)

Dataset Used

Download the IAM dataset from here.
Only the line images and lines.txt (ASCII) are needed.
Place the downloaded files inside the data directory.
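For orientation, here is a small sketch of reading the ground truth out of lines.txt. It assumes the layout described in that file's own header comments: line id, segmentation result, gray level, component count, bounding box (x, y, w, h), and finally the transcription with words joined by `|`. The function name `parse_iam_lines` is hypothetical and the project's own loader may differ.

```python
def parse_iam_lines(text):
    """Return (line_id, transcription) pairs from the content of lines.txt.

    Assumed layout per data row (9 space-separated fields):
    id, status, graylevel, components, x, y, w, h, transcription
    """
    samples = []
    for raw in text.splitlines():
        raw = raw.strip()
        # Skip empty lines and the commented header at the top of the file.
        if not raw or raw.startswith("#"):
            continue
        fields = raw.split(" ")
        if len(fields) < 9:
            continue  # malformed row, ignore it
        line_id = fields[0]
        # Words are separated by "|" in the file; turn them back into spaces.
        transcription = " ".join(fields[8:]).replace("|", " ")
        samples.append((line_id, transcription))
    return samples


sample = (
    "# comment line\n"
    "a01-000u-00 ok 154 19 408 746 1661 89 A|MOVE|to|stop|Mr.|Gaitskell|from\n"
)
print(parse_iam_lines(sample))
```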

The trained model is available for download from this link. It reaches CER=8.32% and was trained on the IAM dataset plus some additionally created data.
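CER (character error rate) is commonly computed as the total edit distance between predictions and ground truth, divided by the total number of ground-truth characters. The sketch below follows that common definition in plain Python. Whether the project aggregates errors in exactly this way is an assumption, and the example strings are made up.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # delete a character from a
                    current[j - 1] + 1,      # insert a character into a
                    previous[j - 1] + cost,  # substitute or keep
                )
            )
        previous = current
    return previous[-1]


def character_error_rate(pairs):
    """pairs is a list of (predicted_text, ground_truth_text) tuples."""
    total_edits = sum(edit_distance(pred, truth) for pred, truth in pairs)
    total_chars = sum(len(truth) for _, truth in pairs)
    return total_edits / total_chars


# Hypothetical predictions: one wrong character out of ten.
cer = character_error_rate([("hella", "hello"), ("world", "world")])
print("CER: {:.2%}".format(cer))
```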

To Train the model from scratch

$ python main.py --train

To validate the model

$ python main.py --validate

To predict

$ python main.py

Run in Web with Flask

$ python upload.py
Validation character error rate of saved model: 8.654728%
Python: 3.6.4 
Tensorflow: 1.8.0
Init with stored values from ../model/snapshot-24
Without Correction clothed leaf by leaf with the dioappoistmest
With Correction clothed leaf by leaf with the dioappoistmest

Prediction output on IAM Test Data

Prediction output on Self Test Data

See the project Devnagari Handwritten Word Recognition with Deep Learning for more insights.

Further Improvement

Use MDLSTM to recognize a whole paragraph at once (see Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with MDLSTM Attention).
Line segmentation can be added for full-paragraph text recognition. For line segmentation you can use the A* path planning algorithm or a CNN model to separate a paragraph into lines.
Better image preprocessing, such as reducing background noise, to handle real-time images more accurately.
A better decoding approach to improve accuracy. Some CTC decoders can be found here.

Feel free to improve this project with a pull request.

This is part of my last-semester project in Computer Engineering at Tribhuvan University, July 2019.

Owner

sushant097
