Code for the paper "DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks" (ICCV '19)

Overview

DewarpNet

This repository contains the code for training DewarpNet.

Recent Updates

  • [May, 2020] Added evaluation images and an important note about Matlab SSIM.
  • [Dec, 2020] Added OCR evaluation details.

Training

  • Prepare data: train.txt and val.txt, each listing one sample per line, for example:
      1/824_8-cp_Page_0503-7Ns0001
      1/824_1-cp_Page_0504-2Cw0001
  • Train Shape Network: python trainwc.py --arch unetnc --data_path ./data/DewarpNet/doc3d/ --batch_size 50 --tboard
  • Train Texture Mapping Network: python trainbm.py --arch dnetccnl --img_rows 128 --img_cols 128 --img_norm --n_epoch 250 --batch_size 50 --l_rate 0.0001 --tboard --data_path ./DewarpNet/doc3d
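
The train.txt and val.txt lists from the data preparation step could be produced with a small helper like the one below. This is only a sketch: the function names, the 10% validation fraction, and the fixed seed are illustrative assumptions, not part of this repository, and you still need to collect the sample IDs (such as 1/824_8-cp_Page_0503-7Ns0001) from your local copy of the dataset yourself.

```python
import random


def split_samples(sample_ids, val_fraction=0.1, seed=0):
    """Shuffle sample IDs deterministically and return (train_ids, val_ids).

    val_fraction and seed are illustrative defaults, not values from the paper.
    """
    ids = sorted(sample_ids)            # sort first so the result does not depend on input order
    random.Random(seed).shuffle(ids)    # seeded shuffle for a reproducible split
    n_val = max(1, int(len(ids) * val_fraction)) if ids else 0
    return ids[n_val:], ids[:n_val]


def write_list(path, ids):
    """Write one sample ID per line, matching the train.txt / val.txt layout."""
    with open(path, "w") as f:
        f.write("\n".join(ids) + "\n")


if __name__ == "__main__":
    # Two IDs copied from the example above; a real list would be much longer.
    example_ids = [
        "1/824_8-cp_Page_0503-7Ns0001",
        "1/824_1-cp_Page_0504-2Cw0001",
    ]
    train_ids, val_ids = split_samples(example_ids, val_fraction=0.5)
    print(len(train_ids), "train /", len(val_ids), "val")
```

Call write_list("train.txt", train_ids) and write_list("val.txt", val_ids) to save the two lists.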

Inference:

  • Run: python infer.py --wc_model_path ./eval/models/unetnc_doc3d.pkl --bm_model_path ./eval/models/dnetccnl_doc3d.pkl --show

Evaluation (Image Metrics):

  • We use the same evaluation code as DocUNet. To reproduce the quantitative results reported in the paper, use the images available here.

  • [Important note about Matlab version] We noticed that Matlab 2020a uses a different SSIM implementation, which gives a better MS-SSIM score (0.5623), whereas we used Matlab 2018b. Please compare the scores according to your Matlab version.

Evaluation (OCR Metrics):

  • The 25 images used for OCR evaluation are listed in /eval/ocr_eval/ocr_files.txt
  • The corresponding ground-truth text is given in /eval/ocr_eval/tess_gt.json
  • For the OCR errors reported in the paper, we used cv2.blur as pre-processing, which gives higher error in all cases. For convenience, we provide the updated numbers (without blur) in the following table:
      Method            ED       CER             ED (no blur)  CER (no blur)
      DocUNet           1975.86  0.4656 (0.263)  1671.80       0.403 (0.256)
      DocUNet on Doc3D  1684.34  0.3955 (0.272)  1296.00       0.294 (0.235)
      DewarpNet         1288.60  0.3136 (0.248)  1007.28       0.249 (0.236)
      DewarpNet (ref)   1114.40  0.2692 (0.234)  812.48        0.204 (0.228)
  • We used the Tesseract (v4.1.0) default configuration for evaluation with PyTesseract (v0.2.6).
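
For readers unfamiliar with the two metrics in the table, the sketch below shows one common way to compute them: ED as the Levenshtein edit distance between the OCR output and the ground-truth text, and CER as that distance divided by the length of the ground truth. These definitions and the function names are assumptions made for illustration; this is not the evaluation script used for the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: minimum number of single-character insertions,
    deletions, and substitutions needed to turn ref into hyp."""
    prev = list(range(len(hyp) + 1))    # distances from the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]                      # deleting the first i characters of ref
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(ref, hyp):
    """Edit distance normalized by the ground-truth length (assumed definition)."""
    if not ref:
        raise ValueError("ground-truth text must not be empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    ground_truth = "document unwarping"
    ocr_output = "docunent unwarpin"
    print(edit_distance(ground_truth, ocr_output))
    print(character_error_rate(ground_truth, ocr_output))
```

In practice ocr_output would come from running Tesseract on an unwarped image, and ground_truth from the corresponding entry in tess_gt.json.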

Models:

  • Pre-trained models are available here. These models were saved before end-to-end training, so they won't give you the end-to-end results reported in Table 2 of the paper. Use the images provided above to reproduce the exact numbers in Table 2.

Dataset:

  • The doc3D dataset can be downloaded using the scripts here.

Citation:

If you use the dataset or this code, please consider citing our work:

@inproceedings{SagnikKeICCV2019,
Author = {Sagnik Das* and Ke Ma* and Zhixin Shu and Dimitris Samaras and Roy Shilkrot},
Booktitle = {Proceedings of International Conference on Computer Vision},
Title = {DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks},
Year = {2019}}
