Responsive Doc. scanner using U^2-Net, Textcleaner and Tesseract

Overview

Responsive Doc. scanner using U^2-Net, Textcleaner and Tesseract

Toolset

  • U^2-Net is used for background removal
  • Textcleaner is used for image cleaning and line deskew (max 5 degrees)
  • Tesseract is used for text angle rotation
  • Deskew is used for line deskew (between 5 and 45 degrees)

Examples

Tested one document on smartphone camera with different angles


To build & deploy

  1. Clone thee repo
  2. Download the model: check app/saved_models/README.md
  3. Build Docker image : docker build -t / : .
  4. Test locally : Run Docker image and check if api is working by running http://localhost:10000
    • CPU : docker run -it -v $PWD:/LOCAL/ -p 10000:80 / :
    • GPU : docker run -it --gpus all -v $PWD:/LOCAL/ -p 10000:80 / :
  5. Push docker image to Dockerhub (optional):
  6. Deploy to Cloud Run (optional):
    • Create your google cloud account
    • Push Docker Image to Google Container Registry
      • create new project called [PROJECT-ID]
      • Open Cloud shell in your Google account and run: docker pull / : docker tag [IMAGE] gcr.io/[PROJECT-ID]/[IMAGE] docker push gcr.io/[PROJECT-ID]/[IMAGE] more detail in this link
    • Create CloudRun Service, and select Container that was created
      • Screenshot of the config - for demo purpose, it will be cost free
    • Click Deploy, and test the Api Url that will display

Limits and Areas for improvements

  • Speed: It takes 7 to 10 seconds to process one image (serverless Cloud Run) With Gpu we can save 2 to 3 seconds (U^2-Net is 3 times faster)
  • Textcleaner is slow but works better on image cleaning, but needs some manual fine-tuning

References

OCR system for Arabic language that converts images of typed text to machine-encoded text.

Arabic OCR OCR system for Arabic language that converts images of typed text to machine-encoded text. The system currently supports only letters (29 l

Hussein Youssef 144 Jan 05, 2023
Deskew is a command line tool for deskewing scanned text documents. It uses Hough transform to detect "text lines" in the image. As an output, you get an image rotated so that the lines are horizontal.

Deskew by Marek Mauder https://galfar.vevb.net/deskew https://github.com/galfar/deskew v1.30 2019-06-07 Overview Deskew is a command line tool for des

Marek Mauder 127 Dec 03, 2022
Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)

ocr-fileformat Validate and transform between OCR file formats (hOCR, ALTO, PAGE, FineReader) Installation Docker System-wide Usage CLI GUI API Transf

Universitätsbibliothek Mannheim 152 Dec 20, 2022
Code for the paper "DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks" (ICCV '19)

DewarpNet This repository contains the codes for DewarpNet training. Recent Updates [May, 2020] Added evaluation images and an important note about Ma

<a href=[email protected]"> 354 Jan 01, 2023
Rest API Written In Python To Classify NSFW Images.

✨ NSFW Classifier API ✨ Rest API Written In Python To Classify NSFW Images. Fastest Solution If you don't want to selfhost it, there's already an inst

Akshay Rajput 23 Dec 30, 2022
a micro OCR network with 0.07mb params.

MicroOCR a micro OCR network with 0.07mb params. Layer (type) Output Shape Param # Conv2d-1 [-1, 64, 8,

william 29 Aug 06, 2022
Assignment work with webcam

work with webcam : Press key 1 to use emojy on your face Press key 2 to use lip and eye on your face Press key 3 to checkered your face Press key 4 to

Hanane Kheirandish 2 May 31, 2022
Msos searcher - A half-hearted attempt at finding a magic square of squares

MSOS searcher A half-hearted attempt at finding (or rather searching) a MSOS (Magic Square of Squares) in the spirit of the Parker Square. Running I r

Niels Mündler 1 Jan 02, 2022
Tool which allow you to detect and translate text.

Text detection and recognition This repository contains tool which allow to detect region with text and translate it one by one. Description Two pretr

Damian Panek 176 Nov 28, 2022
ARU-Net - Deep Learning Chinese Word Segment

ARU-Net: A Neural Pixel Labeler for Layout Analysis of Historical Documents Contents Introduction Installation Demo Training Introduction This is the

128 Sep 12, 2022
Code for the AAAI 2018 publication "SEE: Towards Semi-Supervised End-to-End Scene Text Recognition"

SEE: Towards Semi-Supervised End-to-End Scene Text Recognition Code for the AAAI 2018 publication "SEE: Towards Semi-Supervised End-to-End Scene Text

Christian Bartz 572 Jan 05, 2023
BoxToolBox is a simple python application built around the openCV library

BoxToolBox is a simple python application built around the openCV library. It is not a full featured application to guide you through the w

František Horínek 1 Nov 12, 2021
Image processing using OpenCv

Image processing using OpenCv Write a program that opens the webcam, and the user selects one of the following on the video: ✅ If the user presses the

M.Najafi 4 Feb 18, 2022
Primary QPDF source code and documentation

QPDF QPDF is a command-line tool and C++ library that performs content-preserving transformations on PDF files. It supports linearization, encryption,

QPDF 2.2k Jan 04, 2023
Usando o Amazon Textract como OCR para Extração de Dados no DynamoDB

dio-live-textract2 Repositório de código para o live coding do dia 05/10/2021 sobre extração de dados estruturados e gravação em banco de dados a part

hugoportela 0 Jan 19, 2022
This can be use to convert text in a file to handwritten text.

TextToHandwriting This can be used to convert text to handwriting. Clone this project or download the code. Run TextToImage.py give the filename of th

Ashutosh Mahapatra 2 Feb 06, 2022
A post-processing tool for scanned sheets of paper.

unpaper Originally written by Jens Gulden — see AUTHORS for more information. Licensed under GNU GPL v2 — see COPYING for more information. Overview u

27 Dec 07, 2022
kaldi-asr/kaldi is the official location of the Kaldi project.

Kaldi Speech Recognition Toolkit To build the toolkit: see ./INSTALL. These instructions are valid for UNIX systems including various flavors of Linux

Kaldi 12.3k Jan 05, 2023
Make OpenCV camera loops less of a chore by skipping the boilerplate and getting right to the interesting stuff

camloop Forget the boilerplate from OpenCV camera loops and get to coding the interesting stuff Table of Contents Usage Install Quickstart More advanc

Gabriel Lefundes 9 Nov 12, 2021
Create single line SVG illustrations from your pictures

Create single line SVG illustrations from your pictures

Javier Bórquez 686 Dec 26, 2022