Responsive Doc. scanner using U^2-Net, Textcleaner and Tesseract

Last update: Jul 13, 2022

Overview

A responsive document scanner built with U^2-Net, Textcleaner and Tesseract.

Toolset

U^2-Net is used for background removal
Textcleaner is used for image cleaning and line deskew (max 5 degrees)
Tesseract is used to correct the text rotation angle
Deskew is used for line deskew (between 5 and 45 degrees)
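
The division of labour above can be sketched as a small routing function. This is only an illustration, not the project's code: the name `pick_rotation_tool`, the choice to treat exactly 5 degrees as Textcleaner's range, and the choice to hand anything beyond 45 degrees to Tesseract are all assumptions made for the example.

```python
def pick_rotation_tool(skew_degrees: float) -> str:
    """Return the tool assumed to handle a given skew angle (hypothetical helper)."""
    angle = abs(skew_degrees)
    if angle <= 5:
        # Small line skew: Textcleaner's own deskew (max 5 degrees).
        return "textcleaner"
    if angle <= 45:
        # Moderate line skew: the Deskew tool (between 5 and 45 degrees).
        return "deskew"
    # Larger rotations: assumed here to be a text-rotation case for Tesseract.
    return "tesseract"


if __name__ == "__main__":
    for angle in (2, -12, 90):
        print(angle, pick_rotation_tool(angle))
```

In practice the tools would likely run in sequence rather than being chosen one at a time; the function only makes the angle ranges explicit.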

Examples

Tested on one document photographed with a smartphone camera from different angles

To build & deploy

Clone the repo
Download the model: check app/saved_models/README.md
Build the Docker image: docker build -t [USERNAME]/[IMAGE]:[TAG] .
Test locally: run the Docker image and check that the API is working by opening http://localhost:10000
- CPU: docker run -it -v $PWD:/LOCAL/ -p 10000:80 [USERNAME]/[IMAGE]:[TAG]
- GPU: docker run -it --gpus all -v $PWD:/LOCAL/ -p 10000:80 [USERNAME]/[IMAGE]:[TAG]
Push the Docker image to Docker Hub (optional):
- See https://docs.docker.com/docker-hub/repos/ for account setup
- Create a Docker Hub repository with the same name as your image ID: [USERNAME]/[IMAGE]
- Run docker push [USERNAME]/[IMAGE]:[TAG]
Deploy to Cloud Run (optional):
- Create your Google Cloud account
- Push the Docker image to Google Container Registry
  - Create a new project called [PROJECT-ID]
  - Open Cloud Shell in your Google account and run these commands in order:
    docker pull [USERNAME]/[IMAGE]:[TAG]
    docker tag [IMAGE] gcr.io/[PROJECT-ID]/[IMAGE]
    docker push gcr.io/[PROJECT-ID]/[IMAGE]
    See the Container Registry documentation for more detail.
- Create a Cloud Run service and select the container image that was pushed
  - See the screenshot of the config; for demo purposes, it is cost free
- Click Deploy, then test the API URL that is displayed
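
The "check that the API is working" step, both for the local container and for the deployed Cloud Run URL, can be scripted. The sketch below is an assumption-laden example rather than part of the project: the helper name `api_is_up` is hypothetical, and it only checks that the root URL answers with a 2xx status, since the source does not describe the API's routes.

```python
import urllib.error
import urllib.request


def api_is_up(url: str = "http://localhost:10000", timeout: float = 5.0) -> bool:
    """Return True if a GET on `url` answers with a 2xx status (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # HTTP error statuses, refused connections and timeouts all count as "down".
        return False


if __name__ == "__main__":
    # Port 10000 matches the -p 10000:80 mapping used in the docker run commands.
    print("API reachable:", api_is_up())
```

For the Cloud Run deployment, pass the service URL shown after clicking Deploy instead of the default.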

Limits and Areas for improvements

Speed: processing one image takes 7 to 10 seconds on serverless Cloud Run. With a GPU we can save 2 to 3 seconds, because U^2-Net runs 3 times faster.
Textcleaner is slow and needs some manual fine-tuning, but it gives better image-cleaning results
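
As a rough reading of the speed figures, the arithmetic below estimates how much of the processing time U^2-Net accounts for on CPU. It assumes, which the source does not state, that the whole 2 to 3 second saving comes from U^2-Net alone; the function name is hypothetical.

```python
def implied_cpu_seconds(saved_seconds: float, speedup: float) -> float:
    """CPU time t such that t - t / speedup == saved_seconds."""
    return saved_seconds * speedup / (speedup - 1)


if __name__ == "__main__":
    # Assumed: a 3x speed-up on U^2-Net is the only source of the saving.
    low = implied_cpu_seconds(2, 3)
    high = implied_cpu_seconds(3, 3)
    print(f"U^2-Net on CPU: about {low} to {high} seconds")
    # Total per image with a GPU: best case 7 - 3, worst case 10 - 2.
    print(f"Total with GPU: about {7 - 3} to {10 - 2} seconds")
```

Under that assumption, U^2-Net takes roughly 3 to 4.5 of the 7 to 10 seconds on CPU, so the remaining steps still dominate once a GPU is used.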

References

U^2-Net https://github.com/xuebinqin/U-2-Net.git
Textcleaner http://www.fmwconcepts.com/imagemagick/textcleaner/
Tesseract https://github.com/tesseract-ocr/tesseract
Deskew https://github.com/sbrunner/deskew.git

Owner

AI

GitHub Repository https://amtam0.github.io/u2netscan/webapp/app_u2net.html
