TensorFlow Implementation of FOTS, Fast Oriented Text Spotting with a Unified Network.

Last update: Nov 11, 2022

Overview

FOTS: Fast Oriented Text Spotting with a Unified Network

I am still working on this repo. Updates and detailed instructions are coming soon!

Table of Contents

TensorFlow Versions
Other Requirements
Trained Models
Datasets
Train
- Pre-train with SynthText
- Finetune with ICDAR 2015, ICDAR 2017 MLT or ICDAR 2013
Test
References

TensorFlow Versions

As of now, the pre-training code has been tested on TensorFlow 1.12, 1.14, and 1.15. I may implement a 2.x version in the future.

Other Requirements

GCC >= 6

Trained Models

Temporary pre-trained model
Trained model coming soon

Datasets

Pre-training
Synth800k (the dataset is available only for non-commercial research and educational purposes)
Finetuning
ICDAR 2015, ICDAR 2017 MLT, ICDAR 2013

Train

Pre-train with SynthText

Download the pre-trained ResNet-50 from the TensorFlow-Slim image classification model library page and place it in the ckpt/resnet_v1_50 directory.

cd ckpt/resnet_v1_50
wget http://download.tensorflow.org/models/resnet_v1_50_2016_08_28.tar.gz
tar -zxvf resnet_v1_50_2016_08_28.tar.gz
rm resnet_v1_50_2016_08_28.tar.gz

Download the Synth800k dataset and place it in the data/SynthText/ directory to pre-train the whole network.
Then convert (pre-process) the SynthText data into the ICDAR data format.

python data_provider/SynthText2ICDAR.py
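
For orientation, ICDAR 2015 ground-truth files store one text instance per line: eight comma-separated corner coordinates followed by the transcription, with `###` marking unreadable regions. The sketch below parses one such line. The function name `parse_icdar_line` is hypothetical and not part of this repo, and the exact output of SynthText2ICDAR.py may differ from this layout.

```python
from typing import List, Tuple


def parse_icdar_line(line: str) -> Tuple[List[Tuple[int, int]], str, bool]:
    """Parse one ICDAR 2015-style ground-truth line.

    Returns (corners, transcription, is_dont_care).
    """
    # Ground-truth files often begin with a UTF-8 BOM; drop it and the newline.
    fields = line.lstrip("\ufeff").rstrip("\r\n").split(",")
    if len(fields) < 9:
        raise ValueError("expected 8 coordinates plus a transcription")

    # First eight fields are x1,y1,...,x4,y4 of the quadrilateral.
    coords = [int(float(v)) for v in fields[:8]]
    corners = list(zip(coords[0::2], coords[1::2]))

    # The transcription may itself contain commas, so re-join the remainder.
    text = ",".join(fields[8:])
    return corners, text, text == "###"
```

Regions whose transcription is `###` are usually excluded from the recognition loss, which is why the flag is returned separately.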

Train with SynthText for 10 epochs (with 1 GPU).

python train.py \
  --max_steps=715625 \
  --gpu_list='0' \
  --checkpoint_path=ckpt/synthText_10eps/ \
  --pretrained_model_path=ckpt/resnet_v1_50/resnet_v1_50.ckpt \
  --training_img_data_dir=data/SynthText/ \
  --training_gt_data_dir=data/SynthText/ \
  --icdar=False
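
The value passed to --max_steps follows from the epoch count. As a rough check, assuming SynthText's 858,750 images and a batch size of 12 (the batch size is an assumption here; check the default in train.py), 10 epochs work out to the step count used in the command above:

```python
def steps_for_epochs(num_images: int, batch_size: int, epochs: int) -> int:
    """Number of training steps needed to see every image `epochs` times."""
    total_samples = num_images * epochs
    # Ceiling division so a final partial batch still counts as one step.
    return (total_samples + batch_size - 1) // batch_size


if __name__ == "__main__":
    # 858,750 images and batch size 12 are assumptions, not values read from the repo.
    print(steps_for_epochs(num_images=858_750, batch_size=12, epochs=10))  # 715625
```

If you change the batch size or the number of GPUs, recompute --max_steps the same way to keep the number of epochs fixed.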

Visualize pre-training progress with TensorBoard.

tensorboard --logdir=ckpt/synthText_10eps/

Finetune with ICDAR 2015, ICDAR 2017 MLT or ICDAR 2013

(If you are using the pre-trained model, place all of its files in ckpt/synthText_10eps/.)
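
A quick sanity check can catch an incomplete copy before training starts. TensorFlow 1.x savers normally write a `checkpoint` state file alongside `*.index` and `*.data-*` files; the sketch below only looks for those files and does not load the model. The helper name `looks_like_tf_checkpoint_dir` is hypothetical and not part of this repo.

```python
from pathlib import Path


def looks_like_tf_checkpoint_dir(ckpt_dir: str) -> bool:
    """Return True if the directory contains the usual TF 1.x checkpoint files."""
    d = Path(ckpt_dir)
    # `checkpoint` is the small text file that records the latest checkpoint name.
    has_state_file = (d / "checkpoint").is_file()
    # Each saved step has one .index file and one or more .data-* shards.
    has_index = any(d.glob("*.index"))
    has_data = any(d.glob("*.data-*"))
    return has_state_file and has_index and has_data


if __name__ == "__main__":
    print(looks_like_tf_checkpoint_dir("ckpt/synthText_10eps/"))
```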

Combine ICDAR data before training.
1. Place the ICDAR data under the tmp/ folder.
2. Run the following script to combine the data.
```
python combine_ICDAR_data.py --year [year of ICDAR to train on (13, 15, or 17)]
```

ICDAR 2017 MLT/pre-finetune for ICDAR 2013 or ICDAR 2015 (text detection task only)

Train the pre-trained model with 9,000 images from the ICDAR 2017 MLT training and validation datasets (with 1 GPU).

python train.py \
  --gpu_list='0' \
  --checkpoint_path=ckpt/ICDAR17MLT/ \
  --pretrained_model_path=ckpt/synthText_10eps/ \
  --train_stage=0 \
  --training_img_data_dir=data/ICDAR17MLT/imgs/ \
  --training_gt_data_dir=data/ICDAR17MLT/gts/

ICDAR 2015

Train the model with 1,000 images from the ICDAR 2015 training dataset and 229 images from the ICDAR 2013 training dataset (with 1 GPU).

python train.py \
  --gpu_list='0' \
  --checkpoint_path=ckpt/ICDAR15/ \
  --pretrained_model_path=ckpt/ICDAR17MLT/ \
  --training_img_data_dir=data/ICDAR15+13/imgs/ \
  --training_gt_data_dir=data/ICDAR15+13/gts/

ICDAR 2013 (horizontal text only)

Train the model with 229 images from the ICDAR 2013 training dataset (with 1 GPU).

python train.py \
  --gpu_list='0' \
  --checkpoint_path=ckpt/ICDAR13/ \
  --pretrained_model_path=ckpt/ICDAR17MLT/ \
  --training_img_data_dir=data/ICDAR13/imgs/ \
  --training_gt_data_dir=data/ICDAR13/gts/
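
The three finetuning commands above differ only in their paths and in the optional --train_stage flag. As an illustration, they can be generated from a small table. The `STAGES` dict and the `build_train_command` helper below are hypothetical conveniences, not part of this repo, and only flags that appear in this README are used.

```python
import shlex
from typing import Dict, List

# Each stage: checkpoint to start from, data root, and optional --train_stage value.
STAGES: Dict[str, dict] = {
    "ICDAR17MLT": {"init": "ckpt/synthText_10eps/", "data": "data/ICDAR17MLT", "train_stage": 0},
    "ICDAR15": {"init": "ckpt/ICDAR17MLT/", "data": "data/ICDAR15+13", "train_stage": None},
    "ICDAR13": {"init": "ckpt/ICDAR17MLT/", "data": "data/ICDAR13", "train_stage": None},
}


def build_train_command(stage: str, gpu_list: str = "0") -> List[str]:
    """Build the train.py argument list for one finetuning stage."""
    cfg = STAGES[stage]
    cmd = [
        "python", "train.py",
        f"--gpu_list={gpu_list}",
        f"--checkpoint_path=ckpt/{stage}/",
        f"--pretrained_model_path={cfg['init']}",
    ]
    # Only the ICDAR 2017 MLT stage passes --train_stage in this README.
    if cfg["train_stage"] is not None:
        cmd.append(f"--train_stage={cfg['train_stage']}")
    cmd += [
        f"--training_img_data_dir={cfg['data']}/imgs/",
        f"--training_gt_data_dir={cfg['data']}/gts/",
    ]
    return cmd


if __name__ == "__main__":
    for name in STAGES:
        # Print rather than execute, so the commands can be reviewed first.
        print(" ".join(shlex.quote(arg) for arg in build_train_command(name)))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would launch the stage; printing first makes it easy to compare against the commands listed here.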

Test

Place some images in the test_imgs/ directory and specify a trained checkpoint path to see the test results.

python test.py --test_data_path test_imgs/ --checkpoint_path [checkpoint path]

Owner

Masao Taketani
