CRAFT-Pyotorch:Character Region Awareness for Text Detection Reimplementation for Pytorch

Overview

CRAFT-Reimplementation

Note:If you have any problems, please comment. Or you can join us weChat group. The QR code will update in issues #49 .

Reimplementation:Character Region Awareness for Text Detection Reimplementation based on Pytorch

Character Region Awareness for Text Detection

Youngmin Baek, Bado Lee, Dongyoon Han, Sangdoo Yun, Hwalsuk Lee (Submitted on 3 Apr 2019)

The full paper is available at: https://arxiv.org/pdf/1904.01941.pdf

Install Requirements:

1、PyTroch>=0.4.1
2、torchvision>=0.2.1
3、opencv-python>=3.4.2
4、check requiremtns.txt
5、4 nvidia GPUs(we use 4 nvidia titanX)

pre-trained model:

NOTE: There are old pre-trained models, I will upload the new results pre-trained models' link.
Syndata:Syndata for baidu drive || Syndata for google drive
Syndata+IC15:Syndata+IC15 for baidu drive || Syndata+IC15 for google drive
Syndata+IC13+IC17:Syndata+IC13+IC17 for baidu drive|| Syndata+IC13+IC17 for google drive

Training

Note: When you train the IC15-Data or MLT-Data, please see the annotation in data_loader.py line 92 and line 108-112.

Train for Syndata

  • download the Syndata(I will give the link)
  • change the path in basernet/vgg16_bn.py file:

(/data/CRAFT-pytorch/vgg16_bn-6c64b313.pth -> /your_path/vgg16_bn-6c64b313.pth).You can download the model here.baidu||google

  • change the path in trainSyndata.py file:

(1、/data/CRAFT-pytorch/SynthText -> /your_path/SynthText 2、/data/CRAFT-pytorch/synweights/synweights -> /your_path/real_weights)

  • Run python trainSyndata.py

Train for IC15 data based on Syndata pre-trained model

  • download the IC15 data, rename the image file and the gt file for ch4_training_images and ch4_training_localization_transcription_gt,respectively.
  • change the path in basernet/vgg16_bn.py file:

(/data/CRAFT-pytorch/vgg16_bn-6c64b313.pth -> /your_path/vgg16_bn-6c64b313.pth).You can download the model here.baidu||google

  • change the path in trainic15data.py file:

(1、/data/CRAFT-pytorch/SynthText -> /your_path/SynthText 2、/data/CRAFT-pytorch/real_weights -> /your_path/real_weights)

  • change the path in trainic15data.py file:

(1、/data/CRAFT-pytorch/1-7.pth -> /your_path/your_pre-trained_model_name 2、/data/CRAFT-pytorch/icdar1317 -> /your_ic15data_path/)

  • Run python trainic15data.py

Train for IC13+17 data based on Syndata pre-trained model

  • download the MLT data, rename the image file and the gt file,respectively.
  • change the path in basernet/vgg16_bn.py file:

(/data/CRAFT-pytorch/vgg16_bn-6c64b313.pth -> /your_path/vgg16_bn-6c64b313.pth).You can download the model here.baidu||google

  • change the path in trainic-MLT_data.py file:

(1、/data/CRAFT-pytorch/SynthText -> /your_path/SynthText 2、savemodel path-> your savemodel path)

  • change the path in trainic-MLT_data.py file:

(1、/data/CRAFT-pytorch/1-7.pth -> /your_path/your_pre-trained_model_name 2、/data/CRAFT-pytorch/icdar1317 -> /your_ic15data_path/)

  • Run python trainic-MLT_data.py

If you want to train for weak supervised use our Syndate pre-trained model:

1、You should first download the pre_trained model trained in the Syndata baidu||google.
2、change the data path and pre-trained model path.
3、run python trainic15data.py

This code supprts for Syndata and icdar2015, and we will release the training code for IC13 and IC17 as soon as possible.

Methods dataset Recall precision H-mean
Syndata ICDAR13 71.93% 81.31% 76.33%
Syndata+IC15 ICDAR15 76.12% 84.55% 80.11%
Syndata+MLT(deteval) ICDAR13 86.81% 95.28% 90.85%
Syndata+MLT(deteval)(new gaussian map method) ICDAR13 90.67% 94.56% 92.57%
Syndata+IC15(new gaussian map method) ICDAR15 80.36% 84.25% 82.26%

We have released the latest code with new gaussian map and random crop algorithm.

Note:new gaussian map method can split the inference gaussian region score map
Sample:

Note:We have solved the problem about detecting big word. Now we are training the model. And any issues or advice are welcome.

Sample:

###weChat QR code

Contributing to the project

We will release training code as soon as possible, and we have not yet reached the results given in the author's paper. Any pull requests or issues are welcome. We also hope that you could give us some advice for the project.

Acknowledgement

Thanks for Youngmin Baek, Bado Lee, Dongyoon Han, Sangdoo Yun, Hwalsuk Lee excellent work and code for test. In this repo, we use the author repo's basenet and test code.

License

For commercial use, please contact us.

An unofficial implementation of the paper "AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss".

AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss This is an unofficial implementation of AutoVC based on the official one. The reposi

Chien-yu Huang 27 Jun 16, 2022
Image Recognition Model Generator

Takes a user-inputted query and generates a machine learning image recognition model that determines if an inputted image is or isn't their query

Christopher Oka 1 Jan 13, 2022
零样本学习测评基准,中文版

ZeroCLUE 零样本学习测评基准,中文版 零样本学习是AI识别方法之一。 简单来说就是识别从未见过的数据类别,即训练的分类器不仅仅能够识别出训练集中已有的数据类别, 还可以对于来自未见过的类别的数据进行区分。 这是一个很有用的功能,使得计算机能够具有知识迁移的能力,并无需任何训练数据, 很符合现

CLUE benchmark 27 Dec 10, 2022
Demo processor to illustrate OCR-D Python API

ocrd_vandalize/ Demo processor to illustrate the OCR-D/core Python API Description :TODO: write docs :) Installation From PyPI pip3 install ocrd_vanda

Konstantin Baierer 5 May 05, 2022
OCR powered screen-capture tool to capture information instead of images

NormCap OCR powered screen-capture tool to capture information instead of images. Links: Repo | PyPi | Releases | Changelog | FAQs Content: Quickstart

575 Dec 31, 2022
[ICCV, 2021] Cloud Transformers: A Universal Approach To Point Cloud Processing Tasks

Cloud Transformers: A Universal Approach To Point Cloud Processing Tasks This is an official PyTorch code repository of the paper "Cloud Transformers:

Visual Understanding Lab @ Samsung AI Center Moscow 27 Dec 15, 2022
A fastai/PyTorch package for unpaired image-to-image translation.

Unpaired image-to-image translation A fastai/PyTorch package for unpaired image-to-image translation currently with CycleGAN implementation. This is a

Tanishq Abraham 120 Dec 02, 2022
Omdena-abuja-anpd - Automatic Number Plate Detection for the security of lives and properties using Computer Vision.

Omdena-abuja-anpd - Automatic Number Plate Detection for the security of lives and properties using Computer Vision.

Abdulazeez Jimoh 1 Jan 01, 2022
A bot that plays TFT using OCR. Keeps track of bench, board, items, and plays the user defined team comp.

NOTES: To ensure best results, make sure you are running this on a computer that has decent specs. 1920x1080 fullscreen is required in League, game mu

francis 125 Dec 30, 2022
EQFace: An implementation of EQFace: A Simple Explicit Quality Network for Face Recognition

EQFace: A Simple Explicit Quality Network for Face Recognition The first face recognition network that generates explicit face quality online.

DeepCam Shenzhen 141 Dec 31, 2022
Detect text blocks and OCR poorly scanned PDFs in bulk. Python module available via pip.

doc2text doc2text extracts higher quality text by fixing common scan errors Developing text corpora can be a massive pain in the butt. Much of the tex

Joe Sutherland 1.3k Jan 04, 2023
Rubik's Cube in pygame with OpenGL

Rubik Rubik's Cube in pygame with OpenGL The script show on the screen a Rubik Cube buit with OpenGL. Then I have also implemented all the possible mo

Gabro 2 Apr 15, 2022
Table Extraction Tool

Tree Structure - Table Extraction Fonduer has been successfully extended to perform information extraction from richly formatted data such as tables.

HazyResearch 88 Jun 02, 2022
governance proposal to make fei redeemable for eth

Feil Proposal 🌲 Abstract Migrate all ETH from Fei protocol-controlled value into Yearn ETH Vault. Allow redemptions of outstanding FEI for yvETH. At

13 Mar 31, 2022
A collection of resources (including the papers and datasets) of OCR (Optical Character Recognition).

OCR Resources This repository contains a collection of resources (including the papers and datasets) of OCR (Optical Character Recognition). Contents

Zuming Huang 363 Jan 03, 2023
A curated list of awesome synthetic data for text location and recognition

awesome-SynthText A curated list of awesome synthetic data for text location and recognition and OCR datasets. Text location SynthText SynthText_Chine

Tianzhong 283 Jan 05, 2023
Document blur detection based on Laplacian operator and text detection.

Document Blur Detection For general blurred image, using the variance of Laplacian operator is a good solution. But as for the blur detection of docum

JoeyLr 5 Oct 20, 2022
Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and text alignment capabilities.

Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and text alignment capabilities.

Microsoft 235 Dec 22, 2022
This is a implementation of CRAFT OCR method

This is a implementation of CRAFT OCR method

Esaka 0 Nov 01, 2021
a micro OCR network with 0.07mb params.

MicroOCR a micro OCR network with 0.07mb params. Layer (type) Output Shape Param # Conv2d-1 [-1, 64, 8,

william 29 Aug 06, 2022