TextField: Learning A Deep Direction Field for Irregular Scene Text Detection (TIP 2019)

Last update: Dec 12, 2022

Introduction

The code and trained models of:

TextField: Learning A Deep Direction Field for Irregular Scene Text Detection, TIP 2019 [Paper]

Citation

Please cite the related work in your publications if it helps your research:


@article{xu2018textfield,
  title={TextField: Learning A Deep Direction Field for Irregular Scene Text Detection},
  author={Xu, Yongchao and Wang, Yukang and Zhou, Wei and Wang, Yongpan and Yang, Zhibo and Bai, Xiang},
  journal={arXiv preprint arXiv:1812.01393},
  year={2018}
}

Prerequisite

Caffe and SynthText pretrained model [Link]
Datasets: [Total-Text], [ICDAR2015]
OpenCV 3.4.3
MATLAB

Usage

1. Install Caffe

cp Makefile.config.example Makefile.config
# adjust Makefile.config (for example, enable python layer)
make all -j16
# make sure $CAFFE_ROOT/python is on your PYTHONPATH
make pycaffe

Please refer to the Caffe installation guide for the remaining dependencies.
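Before moving on, it can help to confirm that the Python bindings built by `make pycaffe` are actually visible to the interpreter. The following is a minimal sketch, not part of this repository: the helper name `pycaffe_available` is hypothetical, and it assumes a Python 3 interpreter and that `CAFFE_ROOT` points at the Caffe checkout. It only checks that the `caffe` module can be located; it does not import it or test the build.

```python
import importlib.util
import os
import sys


def pycaffe_available(caffe_root=None):
    """Return True if a module named `caffe` can be located.

    If `caffe_root` is given, `<caffe_root>/python` is prepended to
    sys.path first (hypothetical helper, mirrors the PYTHONPATH hint above).
    """
    if caffe_root is not None:
        python_dir = os.path.join(caffe_root, "python")
        if python_dir not in sys.path:
            sys.path.insert(0, python_dir)
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec("caffe") is not None


if __name__ == "__main__":
    root = os.environ.get("CAFFE_ROOT")  # assumed to be set by the user
    print("pycaffe importable:", pycaffe_available(root))
```

If this prints `False`, the usual cause is that `$CAFFE_ROOT/python` is missing from `PYTHONPATH` or that `make pycaffe` did not complete.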

2. Data and model preparation

# download the datasets and the pretrained model, then move them into place
mkdir data && mv [your_dataset_folder] data/
mkdir models && mv [your_pretrained_model] models/
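The layout produced by the commands above can be checked with a few lines of Python. This is a sketch under assumptions: the helper `missing_paths` and the `EXPECTED` list are hypothetical, and the only file name used is the SynthText model name that appears in the training command of this README. Dataset folder names depend on how you unpacked the downloads, so they are not listed.

```python
import os


def missing_paths(repo_root, expected):
    """Return the entries of `expected` that do not exist under `repo_root`."""
    return [
        rel for rel in expected
        if not os.path.exists(os.path.join(repo_root, rel))
    ]


# Assumed layout relative to the repository root; extend with your dataset folder.
EXPECTED = [
    "data",
    "models",
    os.path.join("models", "synth_iter_800000.caffemodel"),
]

if __name__ == "__main__":
    missing = missing_paths(".", EXPECTED)
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("data/ and models/ look complete")
```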

3. Training scripts

# an example on Total-Text dataset
cd examples/TextField/
python train.py --gpu [your_gpu_id] --dataset total --initmodel ../../models/synth_iter_800000.caffemodel
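Note that `--initmodel` is given relative to `examples/TextField/`, so `../../models/` resolves to the `models/` folder created in step 2. The sketch below illustrates this and assembles the same command programmatically. The helpers `train_command` and `resolve_from` are hypothetical and only use the three flags shown above; any other options of `train.py` are not covered here.

```python
import os


def train_command(gpu_id, dataset, initmodel):
    """Build the argument list for the training call shown above."""
    return [
        "python", "train.py",
        "--gpu", str(gpu_id),
        "--dataset", dataset,
        "--initmodel", initmodel,
    ]


def resolve_from(workdir, relative_path):
    """Resolve `relative_path` as seen from `workdir` (purely lexical, no disk access)."""
    return os.path.normpath(os.path.join(workdir, relative_path))


if __name__ == "__main__":
    init = "../../models/synth_iter_800000.caffemodel"
    print(" ".join(train_command(0, "total", init)))
    # Shows which file under the repository root the relative path points to.
    print(resolve_from(os.path.join("examples", "TextField"), init))
```

The resulting list can be passed to `subprocess.run(..., cwd="examples/TextField")` if you prefer to launch training from a script.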

4. Evaluation scripts

# an example on Total-Text dataset
cd evaluation/total/
./eval.sh

Results and Trained Models

Total-Text

Recall	Precision	F-measure	Link
0.816	0.824	0.820	[Google drive]

*lambda=0.50 for post-processing

ICDAR2015

Recall	Precision	F-measure	Link
0.811	0.846	0.828	[Google drive]

*lambda=0.75 for post-processing
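The F-measure columns above are consistent with the standard definition, the harmonic mean of precision and recall. The short sketch below recomputes both values from the reported recall and precision; the function name `f_measure` is ours, and the evaluation scripts in this repository may compute the score differently in detail.

```python
def f_measure(recall, precision):
    """Harmonic mean of recall and precision (standard F1 definition)."""
    if recall + precision == 0:
        return 0.0  # avoid division by zero when both are zero
    return 2 * recall * precision / (recall + precision)


if __name__ == "__main__":
    # Recall and precision taken from the two result tables above.
    print("Total-Text: {:.3f}".format(f_measure(0.816, 0.824)))
    print("ICDAR2015:  {:.3f}".format(f_measure(0.811, 0.846)))
```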

Owner

Yukang Wang
