OCR system for Arabic language that converts images of typed text to machine-encoded text.

Last update: Jan 05, 2023

Overview

Arabic OCR

An OCR system for the Arabic language that converts images of typed text into machine-encoded text.
The system currently supports letters only (29 in total): ا-ى and لا.
It targets a simplified version of the OCR problem: images that contain only Arabic characters (see the dataset link below for sample images).

Setup

Install Python, then run this command:

pip install -r requirements.txt

Run

Put the images in the src/test directory.
Go to the src directory and run the following command:
```
python OCR.py
```
An output folder will be created containing:
- a text folder with one text file for each input image.
- a running_time file with the time taken to process each image.
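
The output layout described above can be sketched as follows. This is only an illustration and not the code in OCR.py: the `recognize` placeholder, the `run_all` helper, the `.txt` extensions, and the tab-separated timing format are all assumptions made for the example.

```python
import time
from pathlib import Path


def recognize(image_path):
    """Hypothetical stand-in for the real OCR step; returns recognized text."""
    return ""


def run_all(test_dir, output_dir, recognize_fn=recognize):
    """Process every file in test_dir, writing one text file per image
    plus a single running_time file (names and format are assumed)."""
    test_dir, output_dir = Path(test_dir), Path(output_dir)
    text_dir = output_dir / "text"
    text_dir.mkdir(parents=True, exist_ok=True)

    timings = []
    for image_path in sorted(test_dir.iterdir()):
        if not image_path.is_file():
            continue
        start = time.perf_counter()
        text = recognize_fn(image_path)
        elapsed = time.perf_counter() - start

        # One text file per image, named after the image.
        out_file = text_dir / f"{image_path.stem}.txt"
        out_file.write_text(text, encoding="utf-8")
        timings.append(f"{image_path.name}\t{elapsed:.2f}")

    # One line per image: file name and seconds taken.
    timing_file = output_dir / "running_time.txt"
    timing_file.write_text("\n".join(timings) + "\n", encoding="utf-8")
    return timings
```

Calling `run_all("test", "output")` from the src directory would mirror the layout listed above, with the real recognition function passed as `recognize_fn`.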

Pipeline

Dataset

Link to dataset of images and the corresponding text: here.
We used 1000 images to generate the character dataset used for training.

Examples

Line Segmentation

Word Segmentation

Character Segmentation

Performance

Average accuracy: 95%.
Average time per image: 16 seconds.

NOTE

We achieved these results using only the flattened image as the feature.
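
To illustrate what a flattened-image feature looks like, here is a minimal sketch in plain Python. The `flatten_image` helper and the tiny 3x4 sample image are hypothetical; the project's actual image size and preprocessing are not stated here.

```python
def flatten_image(image):
    """Flatten a 2D grid of pixel values into a 1D feature list, row by row."""
    if not image or not image[0]:
        raise ValueError("image must have at least one row and one column")
    width = len(image[0])
    if any(len(row) != width for row in image):
        raise ValueError("all rows must have the same length")
    return [pixel for row in image for pixel in row]


# Hypothetical 3x4 binary character image (1 = ink, 0 = background).
char_image = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
]

features = flatten_image(char_image)
print(len(features))
```

A classifier trained on such vectors generally needs every character image to share the same dimensions, so segmented characters would have to be resized to a common size before flattening.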


Owner

Hussein Youssef
