PyTorch implementation of CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition

Last update: Jul 20, 2022

Overview

This is an unofficial implementation of CDistNet.

All modules described in the paper are implemented except TPS in the visual branch. For a TPS implementation, refer to ASTER.

Requirements

Python 3.6.8
lmdb==0.98
torch==1.5.1
torchvision==0.6.1
Pillow==6.1.0
opencv-python==4.2.0.32
numpy==1.17.1

Data preparation

A tool for converting a raw dataset to an LMDB dataset is provided; see tools/create_lmdb_dataset.py for details.

You can also download LMDB datasets from OCR_Dataset.
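
For orientation, many scene-text recognition repositories store LMDB samples under `image-%09d` and `label-%09d` keys plus a `num-samples` counter. Whether tools/create_lmdb_dataset.py uses exactly this layout is an assumption, so check the script before relying on it. The sketch below builds those key/value pairs in plain Python; `build_cache` and `write_cache` are hypothetical helper names, not functions from this repository.

```python
def build_cache(samples):
    """Map (image_bytes, label) pairs to LMDB-style key/value byte pairs.

    Assumed key layout (not confirmed against this repo's tool):
    image-000000001, label-000000001, ..., num-samples.
    """
    samples = list(samples)
    cache = {}
    for index, (image_bytes, label) in enumerate(samples, start=1):
        cache[b"image-%09d" % index] = image_bytes
        cache[b"label-%09d" % index] = label.encode("utf-8")
    cache[b"num-samples"] = str(len(samples)).encode("utf-8")
    return cache


def write_cache(lmdb_path, cache, map_size=1 << 30):
    """Write the pairs to an LMDB environment (requires the lmdb package)."""
    import lmdb  # imported lazily so build_cache works without it

    env = lmdb.open(lmdb_path, map_size=map_size)
    with env.begin(write=True) as txn:
        for key, value in cache.items():
            txn.put(key, value)
    env.close()
```

Keeping the key construction separate from the write step makes the layout easy to inspect or adjust if the repository's tool turns out to use different keys.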

Train

First, modify the following arguments in configs/cdistnet.yml:

TrainReader: path to the training LMDB dataset.
EvalReader: path to the evaluation LMDB dataset.
Global: general arguments such as image_shape and dict_file.
VisualModule: arguments for the visual branch described in the original paper.
PositionalEmbedding: arguments for the positional branch.
SemanticEmbedding: arguments for the semantic branch.
MDCDP: arguments for the MDCDP module.

python train.py -c configs/cdistnet.yml
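
Before launching training, it can help to confirm that every section named above is present in the config. The following is a minimal sketch, assuming the YAML file has already been loaded into a nested dict; `missing_sections`, the `lmdb_path` key, and all example values are hypothetical and only illustrate the check.

```python
# Section names taken from the list of arguments to modify.
REQUIRED_SECTIONS = (
    "TrainReader",
    "EvalReader",
    "Global",
    "VisualModule",
    "PositionalEmbedding",
    "SemanticEmbedding",
    "MDCDP",
)


def missing_sections(config):
    """Return the required top-level sections that the config lacks."""
    return [name for name in REQUIRED_SECTIONS if name not in config]


# Hypothetical, partially filled config as it might look after loading.
# The keys inside each section and their values are placeholders.
example_config = {
    "TrainReader": {"lmdb_path": "data/train_lmdb"},
    "EvalReader": {"lmdb_path": "data/eval_lmdb"},
    "Global": {"image_shape": [3, 32, 128], "dict_file": "dict.txt"},
}

print(missing_sections(example_config))
```

An empty list means all seven sections exist; it does not validate the arguments inside them.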

Demo

Modify the following arguments in configs/cdistnet.yml:

pretrain_weights: path to the model file.
infer_img: path to the input image.
is_train: set to False.

python predict.py -c configs/cdistnet.yml
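
A quick pre-flight check of the three demo arguments can catch a wrong path before the model is loaded. This is a sketch under the assumption that the arguments have been collected into a flat dict; where they sit inside configs/cdistnet.yml is not stated here, and `demo_setting_problems` is a hypothetical helper.

```python
import os


def demo_setting_problems(settings):
    """Return human-readable problems found in the demo settings."""
    problems = []
    # Both arguments are expected to point to existing files.
    for key in ("pretrain_weights", "infer_img"):
        path = settings.get(key)
        if not path or not os.path.isfile(path):
            problems.append(
                "%s does not point to an existing file: %r" % (key, path)
            )
    # Inference requires is_train to be explicitly False.
    if settings.get("is_train") is not False:
        problems.append("is_train should be False for inference")
    return problems
```

An empty list means the settings look usable; it says nothing about whether the weights match the model configuration.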

TODO

Pretrained models
Test code
Comparison with the original paper on benchmarks (CUTE, IC13, IC15, IIIT5K, SVT, SVTP)
