PyTorch Re-Implementation of EAST: An Efficient and Accurate Scene Text Detector

Related tags

Computer VisionEAST
Overview

Description

This is a PyTorch Re-Implementation of EAST: An Efficient and Accurate Scene Text Detector.

  • Only RBOX part is implemented.
  • Using dice loss instead of class-balanced cross-entropy loss. Some codes refer to argman/EAST and songdejia/EAST
  • The pre-trained model provided achieves 82.79 F-score on ICDAR 2015 Challenge 4 using only the 1000 images. see here for the detailed results.
Model Loss Recall Precision F-score
Original CE 72.75 80.46 76.41
Re-Implement Dice 81.27 84.36 82.79

Prerequisites

Only tested on

  • Anaconda3
  • Python 3.7.1
  • PyTorch 1.0.1
  • Shapely 1.6.4
  • opencv-python 4.0.0.21
  • lanms 1.0.2

When running the script, if some module is not installed you will see a notification and installation instructions. if you failed to install lanms, please update gcc and binutils. The update under conda environment is:

conda install -c omgarcia gcc-6
conda install -c conda-forge binutils

The original lanms code has a bug in normalize_poly that the ref vertices are not fixed when looping the p's ordering to calculate the minimum distance. We fixed this bug in LANMS so that anyone could compile the correct lanms. However, this repo still uses the original lanms.

Installation

1. Clone the repo

git clone https://github.com/SakuraRiven/EAST.git
cd EAST

2. Data & Pre-Trained Model

  • Download Train and Test Data: ICDAR 2015 Challenge 4. Cut the data into four parts: train_img, train_gt, test_img, test_gt.

  • Download pre-trained VGG16 from PyTorch: VGG16 and our trained EAST model: EAST. Make a new folder pths and put the download pths into pths

mkdir pths
mv east_vgg16.pth vgg16_bn-6c64b313.pth pths/

Here is an example:

.
├── EAST
│   ├── evaluate
│   └── pths
└── ICDAR_2015
    ├── test_gt
    ├── test_img
    ├── train_gt
    └── train_img

Train

Modify the parameters in train.py and run:

CUDA_VISIBLE_DEVICES=0,1 python train.py

Detect

Modify the parameters in detect.py and run:

CUDA_VISIBLE_DEVICES=0 python detect.py

Evaluate

  • The evaluation scripts are from ICDAR Offline evaluation and have been modified to run successfully with Python 3.7.1.
  • Change the evaluate/gt.zip if you test on other datasets.
  • Modify the parameters in eval.py and run:
CUDA_VISIBLE_DEVICES=0 python eval.py
Owner
I AM IRON MAN
Text page dewarping using a "cubic sheet" model

page_dewarp Page dewarping and thresholding using a "cubic sheet" model - see full writeup at https://mzucker.github.io/2016/08/15/page-dewarping.html

Matt Zucker 1.2k Dec 29, 2022
Image Detector and Convertor App created using python's Pillow, OpenCV, cvlib, numpy and streamlit packages.

Image Detector and Convertor App created using python's Pillow, OpenCV, cvlib, numpy and streamlit packages.

Siva Prakash 11 Jan 02, 2022
Using computer vision method to recognize and calcutate the features of the architecture.

building-feature-recognition In this repository, we accomplished building feature recognition using traditional/dl-assisted computer vision method. Th

4 Aug 11, 2022
Line based ATR Engine based on OCRopy

OCR Engine based on OCRopy and Kraken using python3. It is designed to both be easy to use from the command line but also be modular to be integrated

948 Dec 23, 2022
Can We Find Neurons that Cause Unrealistic Images in Deep Generative Networks?

Can We Find Neurons that Cause Unrealistic Images in Deep Generative Networks? Artifact Detection/Correction - Offcial PyTorch Implementation This rep

CHOI HWAN IL 23 Dec 20, 2022
Document Image Dewarping

Document image dewarping using text-lines and line Segments Abstract Conventional text-line based document dewarping methods have problems when handli

Taeho Kil 268 Dec 23, 2022
Write-ups for the SwissHackingChallenge2021 CTF.

SwissHackingChallenge 2021 : Write-ups This repository contains a collection of my write-ups for challenges solved during the SwissHackingChallenge (S

Julien Béguin 3 Jun 07, 2021
Detect textlines in document images

Textline Detection Detect textlines in document images Introduction This tool performs border, region and textline detection from document image data

QURATOR-SPK 70 Jun 30, 2022
Code for CVPR'2022 paper ✨ "Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model"

PPE ✨ Repository for our CVPR'2022 paper: Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-

Zipeng Xu 34 Nov 28, 2022
Captcha Recognition

The objective of this project is to recognize the target numbers in the captcha images correctly which would tell us how good or bad a captcha system has been built.

Mohit Kaushik 5 Feb 20, 2022
⛓ marc is a small, but flexible Markov chain generator

About marc (markov chain) is a small, but flexible Markov chain generator. Usage marc is easy to use. To build a MarkovChain pass the object a sequenc

Max Humber 65 Oct 27, 2022
FOTS Pytorch Implementation

News!!! Recognition branch now is added into model. The whole project has beed optimized and refactored. ICDAR Dataset SynthText 800K Dataset detectio

Ning Lu 599 Dec 19, 2022
POT : Python Optimal Transport

This open source Python library provide several solvers for optimization problems related to Optimal Transport for signal, image processing and machine learning.

Python Optimal Transport 1.7k Jan 04, 2023
Implementation of our paper 'PixelLink: Detecting Scene Text via Instance Segmentation' in AAAI2018

Code for the AAAI18 paper PixelLink: Detecting Scene Text via Instance Segmentation, by Dan Deng, Haifeng Liu, Xuelong Li, and Deng Cai. Contributions

758 Dec 22, 2022
chineseocr/table_line 表格线检测模型pytorch版

table_line_pytorch chineseocr/table_detct 表格线检测模型table_line pytorch版 原项目github: https://github.com/chineseocr/table-detect 1、模型转换 下载原项目table_detect模型文

1 Oct 21, 2021
Sort By Face

Sort-By-Face This is an application with which you can either sort all the pictures by faces from a corpus of photos or retrieve all your photos from

0 Nov 29, 2021
An application of high resolution GANs to dewarp images of perturbed documents

Docuwarp This project is focused on dewarping document images through the usage of pix2pixHD, a GAN that is useful for general image to image translat

Thomas Huang 97 Dec 25, 2022
This project proposes a camera vision based cursor control system, using hand moment captured from a webcam through a landmarks of hand by using Mideapipe module

This project proposes a camera vision based cursor control system, using hand moment captured from a webcam through a landmarks of hand by using Mideapipe module

Chandru 2 Feb 20, 2022
Automatic Number Plate Recognition (ANPR) is a highly accurate system capable of reading vehicle number plates without human intervention

ANPR ANPR is therefore the underlying technology used to find a vehicle license/number plate and it, in turn, supplies this information to a next stag

Melih Emin Kılıçoğlu 1 Jan 09, 2022