This is a TensorFlow re-implementation of PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network.

Last update: Dec 30, 2022

Overview

PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network

Introduction

This is a TensorFlow re-implementation of PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network.

Thanks to the author (@whai362) for the awesome work!

Installation

Any TensorFlow version above 1.0 should work.
Python 2 or 3 is fine.

Download

A model trained on the ICDAR 2015 training set plus the ICDAR 2017 MLT training set:

Baidu Yun (extraction code: pffd)

Google Drive

This model does not match the results reported in the paper; it is only a reference. You can fine-tune it, or optimize further on top of this code.

Dataset	Precision (%)	Recall (%)	F-measure (%)
ICDAR 2015 (val)	74.61	80.93	77.64
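
As a quick sanity check on the table, the F-measure is the harmonic mean of precision and recall. The sketch below recomputes it from the two reported values; the function name f_measure is made up for illustration and is not part of this repository.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (both in the same unit, here percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the ICDAR 2015 (val) row above.
print(round(f_measure(74.61, 80.93), 2))  # 77.64
```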

Train

To train the model, provide the dataset path. That path must contain a separate ground-truth (gt) text file for each image, and each gt text file must have the same name as its image file.
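
Before training, it can help to confirm that every image really has a matching gt file. The following is a minimal sketch, not part of this repository: the helper name find_missing_gt, the list of image extensions, and the assumption that gt files use a .txt extension with the same base name as the image are all illustrative guesses.

```python
import os

# Assumed image extensions; adjust to whatever your dataset actually uses.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def find_missing_gt(data_dir):
    """Return the image file names in data_dir that have no same-named .txt file."""
    names = os.listdir(data_dir)
    # Base names (without extension) of all gt text files.
    gt_stems = {
        os.path.splitext(n)[0] for n in names if n.lower().endswith(".txt")
    }
    missing = []
    for name in sorted(names):
        stem, ext = os.path.splitext(name)
        if ext.lower() in IMAGE_EXTS and stem not in gt_stems:
            missing.append(name)
    return missing
```

Calling find_missing_gt("./data/ocr/icdar2015/") would return an empty list when every image is paired with a gt file.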

Then run train.py like:

python train.py --gpu_list=0 --input_size=512 --batch_size_per_gpu=8 --checkpoint_path=./resnet_v1_50/ \
--training_data_path=./data/ocr/icdar2015/

If you have more than one GPU, pass the GPU IDs to gpu_list (e.g. --gpu_list=0,1,2,3).
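
Because the batch size flag is per GPU, the number of images processed per step grows with the number of IDs in gpu_list. The sketch below illustrates that arithmetic under the assumption that the comma-separated string is simply split into device IDs; the helper names are hypothetical and this is not the actual parsing code in train.py.

```python
def parse_gpu_list(gpu_list):
    """Split a string such as "0,1,2,3" into a list of integer GPU IDs."""
    return [int(g) for g in gpu_list.split(",") if g.strip()]


def total_batch_size(gpu_list, batch_size_per_gpu):
    """Images per training step across all listed GPUs (assumed data parallelism)."""
    return len(parse_gpu_list(gpu_list)) * batch_size_per_gpu


print(parse_gpu_list("0,1,2,3"))       # [0, 1, 2, 3]
print(total_batch_size("0,1,2,3", 8))  # 32
```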

Note:

Right now only the ICDAR 2017 data format is supported as input, e.g. (116,1179,206,1179,206,1207,116,1207,"###"), but you can modify data_provider.py to support polygon-format input.
Polygon shrinking is already supported through the pyclipper module.
This re-implementation is just for fun, but I'll continue to improve the code.
The PSE algorithm is re-implemented in C++ (with Python 2 it runs as is; with Python 3, replace python-config with python3-config in the makefile).
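
To make the input format in the first note concrete, here is a minimal sketch of how one such gt line could be parsed into four corner points and a label. It is not the code in data_provider.py; the function name parse_gt_line is hypothetical, and treating "###" as a do-not-care marker follows the usual ICDAR convention, which is assumed here.

```python
def parse_gt_line(line):
    """Parse 'x1,y1,x2,y2,x3,y3,x4,y4,text' into (points, text, ignore)."""
    # Drop surrounding whitespace and a possible UTF-8 byte order mark.
    cleaned = line.strip().lstrip("\ufeff")
    # Split at most 8 times so commas inside the text label are preserved.
    parts = cleaned.split(",", 8)
    if len(parts) < 9:
        raise ValueError("expected 8 coordinates and a label: %r" % line)
    coords = [int(v) for v in parts[:8]]
    points = [(coords[i], coords[i + 1]) for i in range(0, 8, 2)]
    text = parts[8].strip().strip('"')
    # Assumption: '###' marks a region that should be ignored during training.
    ignore = text == "###"
    return points, text, ignore


points, text, ignore = parse_gt_line('116,1179,206,1179,206,1207,116,1207,"###"')
print(points)  # [(116, 1179), (206, 1179), (206, 1207), (116, 1207)]
print(ignore)  # True
```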

Test

Run eval.py like:

python eval.py --test_data_path=./tmp/images/ --gpu_list=0 --checkpoint_path=./resnet_v1_50/ \
--output_dir=./tmp/

A text file and a result image will then be written to the output path.

Examples

About issues

If you encounter a problem, check the existing issues first, or open a new issue.

Reference

Acknowledge

@rkshuai found a bug in the feature concatenation in model.py.

If this repository helps you, please star it. Thanks.

Owner

Michael liu
