PyTorch implementation of CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition

Last update: Jul 20, 2022

This is an unofficial implementation of CDistNet.

All modules described in the paper are implemented except for TPS in the visual branch. For a TPS implementation, refer to ASTER.

Requirements

Python 3.6.8
lmdb==0.98
torch==1.5.1
torchvision==0.6.1
Pillow==6.1.0
opencv-python==4.2.0.32
numpy==1.17.1

Data preparation

A tool is provided to convert a raw dataset into an LMDB dataset. For details, see tools/create_lmdb_dataset.py.

You can also download ready-made LMDB datasets from OCR_Dataset.
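
For orientation, the sketch below shows how image/label pairs are commonly packed into LMDB key-value entries by text-recognition tools. The key layout (`image-%09d`, `label-%09d`, `num-samples`) and the helper names `build_cache` and `write_cache` are assumptions based on that common convention and have not been checked against tools/create_lmdb_dataset.py, so consult the script for the format it actually writes.

```python
def build_cache(samples):
    """Map (image_bytes, label) pairs to LMDB-style key/value byte entries.

    Assumed key layout (not confirmed for this repository):
    image-000000001, label-000000001, ..., plus a num-samples counter.
    """
    cache = {}
    count = 0
    for count, (image_bytes, label) in enumerate(samples, start=1):
        cache[("image-%09d" % count).encode()] = image_bytes
        cache[("label-%09d" % count).encode()] = label.encode("utf-8")
    cache[b"num-samples"] = str(count).encode()
    return cache


def write_cache(lmdb_path, cache, map_size=1 << 30):
    """Write the entries to an LMDB environment (requires the lmdb package)."""
    import lmdb  # imported lazily so build_cache works without it

    env = lmdb.open(lmdb_path, map_size=map_size)
    with env.begin(write=True) as txn:
        for key, value in cache.items():
            txn.put(key, value)
    env.close()
```

Here `build_cache` only prepares the entries in memory, so it can be inspected before anything is written to disk with `write_cache`.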

Train

First, edit the following sections in configs/cdistnet.yml.

TrainReader: sets the path of the training LMDB dataset.
EvalReader: sets the path of the evaluation LMDB dataset.
Global: sets general arguments such as image_shape and dict_file.
VisualModule: sets the arguments of the visual branch described in the original paper.
PositionalEmbedding: sets the arguments of the positional branch.
SemanticEmbedding: sets the arguments of the semantic branch.
MDCDP: sets the arguments of the MDCDP module.

python train.py -c configs/cdistnet.yml
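
Before launching training, it can help to confirm that the configuration contains every section listed above. The following is a minimal sketch, assuming the YAML file has already been loaded into a Python dict (for example with PyYAML's `yaml.safe_load`) and that the sections are top-level keys. The helper `missing_sections` is hypothetical and not part of the repository.

```python
# Section names taken from the list above; their top-level placement
# in configs/cdistnet.yml is an assumption.
REQUIRED_SECTIONS = (
    "TrainReader",
    "EvalReader",
    "Global",
    "VisualModule",
    "PositionalEmbedding",
    "SemanticEmbedding",
    "MDCDP",
)


def missing_sections(config):
    """Return the required section names that are absent from the config dict."""
    return [name for name in REQUIRED_SECTIONS if name not in config]


# Placeholder config with only three sections filled in.
example = {
    "TrainReader": {},
    "EvalReader": {},
    "Global": {"image_shape": None, "dict_file": None},
}
print(missing_sections(example))
# ['VisualModule', 'PositionalEmbedding', 'SemanticEmbedding', 'MDCDP']
```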

Demo

Modify the following arguments in configs/cdistnet.yml.

pretrain_weights: set to the path of the model file.
infer_img: set to the path of the input image.
is_train: set to False.

python predict.py -c configs/cdistnet.yml
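
The three demo settings can be sanity-checked before running the command. This is a sketch under stated assumptions: the settings are passed as a flat dict, since the section of configs/cdistnet.yml that holds them is not stated here, and `demo_config_problems` is a hypothetical helper, not part of the repository.

```python
import os


def demo_config_problems(settings):
    """Return human-readable problems with the demo settings (empty if none)."""
    problems = []
    # Both paths must point to files that exist on disk.
    for key in ("pretrain_weights", "infer_img"):
        path = settings.get(key)
        if not path or not os.path.isfile(path):
            problems.append("%s does not point to an existing file: %r" % (key, path))
    # Inference requires is_train to be explicitly False.
    if settings.get("is_train") is not False:
        problems.append("is_train should be False for inference")
    return problems
```

An empty list means the paths exist and is_train is disabled; otherwise each entry names the setting to fix.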

TODO

Pretrained models
Test code
Comparison with the original paper on benchmarks (CUTE, IC13, IC15, IIIT5K, SVT, SVTP)
