PyTorch Re-Implementation of EAST: An Efficient and Accurate Scene Text Detector

Last update: Dec 20, 2022

Description

This is a PyTorch Re-Implementation of EAST: An Efficient and Accurate Scene Text Detector.

Only the RBOX part is implemented.
Dice loss is used instead of class-balanced cross-entropy loss. Some of the code is adapted from argman/EAST and songdejia/EAST.
The provided pre-trained model achieves an F-score of 82.79 on ICDAR 2015 Challenge 4 using only the 1000 images. See here for the detailed results.
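As a rough illustration of the dice loss mentioned above, here is a minimal sketch in plain Python over flattened score maps. The function name dice_loss and the smoothing constant eps are assumptions for this example and are not taken from the repository; with PyTorch tensors the same sums would typically be computed with torch.sum.

```python
def dice_loss(gt_score, pred_score, eps=1e-5):
    """Dice loss between two flattened score maps with values in [0, 1].

    Illustrative sketch only: the name and eps are assumptions,
    not the repository's actual implementation.
    """
    if len(gt_score) != len(pred_score):
        raise ValueError("score maps must have the same number of elements")
    # Overlap between ground truth and prediction.
    intersection = sum(g * p for g, p in zip(gt_score, pred_score))
    # Total mass of both maps; eps avoids division by zero.
    total = sum(gt_score) + sum(pred_score) + eps
    return 1.0 - 2.0 * intersection / total


if __name__ == "__main__":
    perfect = dice_loss([1, 1, 0, 0], [1, 1, 0, 0])
    disjoint = dice_loss([1, 1, 0, 0], [0, 0, 1, 1])
    print(perfect, disjoint)
```

The loss is close to 0 when the predicted text region matches the ground truth and reaches 1 when the two do not overlap. Because it depends on the overlap ratio rather than on per-pixel counts, it needs no explicit class-balancing weight.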

Model	Loss	Recall	Precision	F-score
Original	CE	72.75	80.46	76.41
Re-Implement	Dice	81.27	84.36	82.79
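The F-score column is consistent with the harmonic mean of precision and recall. The short check below recomputes it from the table values; the helper name f_score is introduced here only for illustration.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (both in the same units)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Values copied from the table above (percentages).
    print(round(f_score(80.46, 72.75), 2))  # 76.41 (Original, CE)
    print(round(f_score(84.36, 81.27), 2))  # 82.79 (Re-Implement, Dice)
```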

Prerequisites

Only tested on

Anaconda3
Python 3.7.1
PyTorch 1.0.1
Shapely 1.6.4
opencv-python 4.0.0.21
lanms 1.0.2

When you run a script, any missing module is reported together with installation instructions. If you fail to install lanms, update gcc and binutils. In a conda environment, the update is:

conda install -c omgarcia gcc-6
conda install -c conda-forge binutils

The original lanms code has a bug in normalize_poly: the reference vertices are not kept fixed while looping over the orderings of p to find the minimum distance. We fixed this bug in LANMS so that anyone can compile a corrected lanms. However, this repo still uses the original lanms.
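To make the intended behavior concrete, the sketch below aligns the vertex order of a polygon p to a reference polygon that stays fixed throughout the search. It is a simplified Python illustration, not the lanms C++ code: the name align_to_reference is hypothetical, and only cyclic rotations of p are tried, which is an assumption about which orderings need to be considered.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two (x, y) points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def align_to_reference(ref, poly):
    """Return the cyclic rotation of poly whose vertices lie closest to ref.

    ref is never modified or reordered inside the loop; only poly is rotated.
    Hypothetical helper for illustration, not part of lanms.
    """
    poly = list(poly)
    if len(ref) != len(poly):
        raise ValueError("polygons must have the same number of vertices")
    best, best_cost = poly, float("inf")
    for shift in range(len(poly)):
        candidate = poly[shift:] + poly[:shift]
        # Sum of vertex-to-vertex distances against the fixed reference.
        cost = sum(squared_distance(r, c) for r, c in zip(ref, candidate))
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best


if __name__ == "__main__":
    ref = [(0, 0), (10, 0), (10, 5), (0, 5)]
    p = [(10, 5), (0, 5), (0, 0), (10, 0)]  # same box, different start vertex
    print(align_to_reference(ref, p))
```

Consistent vertex ordering matters when overlapping boxes are merged vertex by vertex: if the two polygons start at different corners, averaging their coordinates produces a distorted quadrilateral.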

Installation

1. Clone the repo

git clone https://github.com/SakuraRiven/EAST.git
cd EAST

2. Data & Pre-Trained Model

Download the training and test data from ICDAR 2015 Challenge 4 and split it into four folders: train_img, train_gt, test_img, test_gt.
Download the pre-trained VGG16 weights from PyTorch (VGG16) and our trained EAST model (EAST). Create a new folder named pths and move the downloaded .pth files into it:

mkdir pths
mv east_vgg16.pth vgg16_bn-6c64b313.pth pths/

Here is an example of the resulting directory layout:

.
├── EAST
│   ├── evaluate
│   └── pths
└── ICDAR_2015
    ├── test_gt
    ├── test_img
    ├── train_gt
    └── train_img
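Before training, it can help to confirm that the folders match the layout above. The following is a small sketch using only the standard library; the helper name missing_dirs and the assumption that EAST and ICDAR_2015 sit side by side under one root directory are taken from the example tree, not from the repository's scripts.

```python
from pathlib import Path

# Directories from the example layout, relative to the common root.
EXPECTED_DIRS = [
    "EAST/evaluate",
    "EAST/pths",
    "ICDAR_2015/test_gt",
    "ICDAR_2015/test_img",
    "ICDAR_2015/train_gt",
    "ICDAR_2015/train_img",
]


def missing_dirs(root):
    """Return the expected directories that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    # Run from inside the EAST folder, so the common root is its parent.
    missing = missing_dirs("..")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Directory layout looks complete.")
```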

Train

Modify the parameters in train.py and run:

CUDA_VISIBLE_DEVICES=0,1 python train.py

Detect

Modify the parameters in detect.py and run:

CUDA_VISIBLE_DEVICES=0 python detect.py

Evaluate

The evaluation scripts come from the ICDAR offline evaluation and have been modified to run with Python 3.7.1.
Replace evaluate/gt.zip if you test on other datasets.
Modify the parameters in eval.py and run:

CUDA_VISIBLE_DEVICES=0 python eval.py
