Official implementations of PSENet, PAN and PAN++.

Overview

News

  • (2021/11/03) Paddle implementation of PAN, see Paddle-PANet. Thanks @simplify23.
  • (2021/04/08) PSENet and PAN are included in MMOCR.

Introduction

This repository contains the official implementations of PSENet, PAN, PAN++, and FAST [coming soon].

Text Detection
Text Spotting

Installation

First, clone the repository locally:

git clone https://github.com/whai362/pan_pp.pytorch.git

Then, install PyTorch 1.1.0+, torchvision 0.3.0+, and other requirements:

conda install pytorch torchvision -c pytorch
pip install -r requirement.txt

Finally, compile codes of post-processing:

# build pse and pa algorithms
sh ./compile.sh

Dataset

Please refer to dataset/README.md for dataset preparation.

Training

CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py ${CONFIG_FILE}

For example:

CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py config/pan/pan_r18_ic15.py

Testing

Evaluate the performance

python test.py ${CONFIG_FILE} ${CHECKPOINT_FILE}
cd eval/
./eval_{DATASET}.sh

For example:

python test.py config/pan/pan_r18_ic15.py checkpoints/pan_r18_ic15/checkpoint.pth.tar
cd eval/
./eval_ic15.sh

Evaluate the speed

python test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --report_speed

For example:

python test.py config/pan/pan_r18_ic15.py checkpoints/pan_r18_ic15/checkpoint.pth.tar --report_speed

Citation

Please cite the related works in your publications if it helps your research:

PSENet

@inproceedings{wang2019shape,
  title={Shape Robust Text Detection with Progressive Scale Expansion Network},
  author={Wang, Wenhai and Xie, Enze and Li, Xiang and Hou, Wenbo and Lu, Tong and Yu, Gang and Shao, Shuai},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={9336--9345},
  year={2019}
}

PAN

@inproceedings{wang2019efficient,
  title={Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network},
  author={Wang, Wenhai and Xie, Enze and Song, Xiaoge and Zang, Yuhang and Wang, Wenjia and Lu, Tong and Yu, Gang and Shen, Chunhua},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={8440--8449},
  year={2019}
}

PAN++

@article{wang2021pan++,
  title={PAN++: Towards Efficient and Accurate End-to-End Spotting of Arbitrarily-Shaped Text},
  author={Wang, Wenhai and Xie, Enze and Li, Xiang and Liu, Xuebo and Liang, Ding and Zhibo, Yang and Lu, Tong and Shen, Chunhua},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2021},
  publisher={IEEE}
}

FAST

@misc{chen2021fast,
  title={FAST: Searching for a Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation}, 
  author={Zhe Chen and Wenhai Wang and Enze Xie and ZhiBo Yang and Tong Lu and Ping Luo},
  year={2021},
  eprint={2111.02394},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

License

This project is developed and maintained by IMAGINE [email protected] Key Laboratory for Novel Software Technology, Nanjing University.

IMAGINE Lab

This project is released under the Apache 2.0 license.

Comments
  • Evaluation of the performance result

    Evaluation of the performance result

    Hello Author, First of all, I would like to appreciate your work and effort. I have tried your repo. The evaluation code gives me an error of the "The sample 199 not present in GT," but the label text is there. When I tried to see the result via visualizing it on the images, it seems good. Let me know if there is any solution from your side.

    opened by dikubab 9
  • _pickle.PicklingError: Can't pickle <class 'cPolygon.Error'>: import of module 'cPolygon' failed

    _pickle.PicklingError: Can't pickle : import of module 'cPolygon' failed

    more complete log as belows: Epoch: [1 | 600] /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/torch/nn/functional.py:2941: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead. warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.") /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/torch/nn/functional.py:3121: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details. "See the documentation of nn.Upsample for details.".format(mode)) (1/374) LR: 0.001000 | Batch: 2.668s | Total: 0min | ETA: 17min | Loss: 1.619 | Loss(text/kernel/emb/rec): 0.680/0.193/0.746/0.000 | IoU(text/kernel): 0.324/0.335 | Acc rec: 0.000 Traceback (most recent call last): File "/data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/multiprocessing/queues.py", line 236, in _feed obj = _ForkingPickler.dumps(obj) File "/data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/multiprocessing/reduction.py", line 51, in dumps cls(buf, protocol).dump(obj) _pickle.PicklingError: Can't pickle <class 'cPolygon.Error'>: import of module 'cPolygon' failed

    the code runs normally when using the CTW1500 datasets. but encounter errors when using my own datasets.

    it seems fine in the first run (1/374), what is wrong ? I have no idea.

    opened by Zhang-O 5
  • 关于训练的问题

    关于训练的问题

    您好!我现在在自己的数据上进行训练,训练过程是这样的 image Epoch: [212 | 600] (1/198) LR: 0.000677 | Batch: 3.934s | Total: 0min | ETA: 13min | Loss: 0.752 | Loss(text/kernel/emb/rec): 0.493/0.199/0.059/0.000 | IoU(text/kernel): 0.055/0.553 | Acc rec: 0.000 (21/198) LR: 0.000677 | Batch: 1.089s | Total: 0min | ETA: 3min | Loss: 0.731 | Loss(text/kernel/emb/rec): 0.478/0.199/0.054/0.000 | IoU(text/kernel): 0.048/0.482 | Acc rec: 0.000 (41/198) LR: 0.000677 | Batch: 1.022s | Total: 1min | ETA: 3min | Loss: 0.732 | Loss(text/kernel/emb/rec): 0.478/0.198/0.056/0.000 | IoU(text/kernel): 0.049/0.476 | Acc rec: 0.000 这个Acc rec一直是0,我终止训练后,在测试数据上进行测试时,output输出的是空的,请问是怎么回事呢,感谢啦!

    opened by mayidu 3
  • 关于后处理的疑问

    关于后处理的疑问

    1. 后处理的代码中当kernel中两个连通域的面积比大于max_rate时,将这两个连通域的flag赋值为1,在扩充时,必须同时满足当前扩充的点所属的连通域的flag值为1且与kernal的similar vector距离大于3时才不扩充该点。请问设flag这步操作的作用是什么,直接判断与Kernel的similar vector的距离可以吗?
    2. 论文中扩充的点与kernel相似向量的欧式距离thresh值为6,代码中为3,请问实际应用中这个值跟什么有关系,是数据集的某些特点吗?
    opened by jewelc92 3
  • Regarding pa.pyx

    Regarding pa.pyx

    Hi,

    I try to run your code and figure out that in your last line in pa.pyx

    return _pa(kernels[:-1], emb, label, cc, kernel_num, label_num, min_area)

    Looks like this should be

    return _pa(kernels, emb, label, cc, kernel_num, label_num, min_area)

    So that we can scan over all kernels (you skip the last kernel) and there is no crash in this function. Am I correct?

    Thanks.

    opened by liuch37 3
  • AttributeError: 'Namespace' object has no attribute 'resume'

    AttributeError: 'Namespace' object has no attribute 'resume'

    PAN++ic15,An error appears when trying to test the model:

    reading type: pil. Traceback (most recent call last): File "test.py", line 155, in main(args) File "test.py", line 138, in main print("No checkpoint found at '{}'".format(args.resume)) AttributeError: 'Namespace' object has no attribute 'resume'

    opened by lrjj 2
  • 训练Total Text时遇到的问题

    训练Total Text时遇到的问题

    运行 python train.py config/pan/pan_r18_tt.py 后,出现如下情况: p1 Traceback (most recent call last): File "/home/dell2/anaconda3/envs/pannet/lib/python3.6/multiprocessing/queues.py", line 234, in _feed obj = _ForkingPickler.dumps(obj) File "/home/dell2/anaconda3/envs/pannet/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps cls(buf, protocol).dump(obj) _pickle.PicklingError: Can't pickle <class 'cPolygon.Error'>: import of module 'cPolygon' failed 似乎是迭代过程中出现的问题且只出现在训练TT数据集的时候 请问出现这种情况该怎样解决呢?谢谢您

    opened by mashumli 2
  • 执行test.py提示TypeError: 'module' object is not callable

    执行test.py提示TypeError: 'module' object is not callable

    将模型路径和config文件路径配置好了之后,执行python test.py,提示如下: Traceback (most recent call last): File "test.py", line 117, in main(args) File "test.py", line 107, in main test(test_loader, model, cfg) File "test.py", line 56, in test outputs = model(**data) File "/home/ethony/anaconda3/envs/ocr/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in call result = self.forward(*input, **kwargs) File "/media/ethony/C14D581BDA18EBFA/lyg_datas_and_code/OCR_work/pan_pp.pytorch-master/models/pan.py", line 104, in forward det_res = self.det_head.get_results(det_out, img_metas, cfg) File "/media/ethony/C14D581BDA18EBFA/lyg_datas_and_code/OCR_work/pan_pp.pytorch-master/models/head/pa_head.py", line 65, in get_results label = pa(kernels, emb) TypeError: 'module' object is not callable 看提示应该是model/post_processing下的pa没有正确导入,导入为模块了,这应该怎么解决呢

    opened by ethanlighter 2
  • problems in train.py

    problems in train.py

    Hi. When I run 'python train.py config/pan/pan_r18_ic15.py' , the errors are as followings: Do you know how to solve the problem? Thank you very much. Traceback (most recent call last): File "train.py", line 234, in main(args) File "train.py", line 216, in main train(train_loader, model, optimizer, epoch, start_iter, cfg) File "train.py", line 41, in train for iter, data in enumerate(train_loader): File "D:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 435, in next data = self._next_data() File "D:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 1085, in _next_data return self._process_data(data) File "D:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 1111, in _process_data data.reraise() File "D:\Anaconda3\lib\site-packages\torch_utils.py", line 428, in reraise raise self.exc_type(msg) TypeError: function takes exactly 5 arguments (1 given)

    opened by YUDASHUAI916 2
  • not sure about run compile.sh

    not sure about run compile.sh

    (zyl_torch16) [email protected]:/data/zhangyl/pan_pp.pytorch-master$ sh ./compile.sh Compiling pa.pyx because it depends on /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/init.pxd. [1/1] Cythonizing pa.pyx /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /data/zhangyl/pan_pp.pytorch-master/models/post_processing/pa/pa.pyx tree = Parsing.p_module(s, pxd, full_module_name) running build_ext building 'pa' extension creating build creating build/temp.linux-x86_64-3.7 gcc -pthread -B /data/tools/anaconda3/envs/zyl_torch16/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include -I/data/tools/anaconda3/envs/zyl_torch16/include/python3.7m -c pa.cpp -o build/temp.linux-x86_64-3.7/pa.o -O3 cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ In file included from /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:1822:0, from /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:12, from /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include/numpy/arrayobject.h:4, from pa.cpp:647: /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp] #warning "Using deprecated NumPy API, disable it with "
    ^~~~~~~ g++ -pthread -shared -B /data/tools/anaconda3/envs/zyl_torch16/compiler_compat -L/data/tools/anaconda3/envs/zyl_torch16/lib -Wl,-rpath=/data/tools/anaconda3/envs/zyl_torch16/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.7/pa.o -o /data/zhangyl/pan_pp.pytorch-master/models/post_processing/pa/pa.cpython-37m-x86_64-linux-gnu.so (zyl_torch16) [email protected]:/data/zhangyl/pan_pp.pytorch-master$

    this is the compile history, I am not sure whether is successully build or not.

    opened by Zhang-O 2
  • morphology operations from kornia

    morphology operations from kornia

    Hi,

    Your FAST paper is really amazing! While you already have an implementation of erosion/dilation, let me offer using our set of morphology, implemented in pyre pytorch: https://kornia.readthedocs.io/en/latest/morphology.html

    https://kornia-tutorials.readthedocs.io/en/master/morphology_101.html

    Best, Dmytro.

    opened by ducha-aiki 1
  • The sample 199 not present in GT

    The sample 199 not present in GT

    Hello Author, First of all, I would like to appreciate your work and effort. I have tried your repo. The evaluation code gives me an error of the "The sample 199 not present in GT," but the label text is there. When I tried to see the result via visualizing it on the images, it seems good. Let me know if there is any solution from your side.

    opened by zeng-cy 1
  • How  to predict a new image using the training weight?it doesn't work below.

    How to predict a new image using the training weight?it doesn't work below.

    How to predict a new image using the training weight?it doesn't work below.

    python test.py config/pan/pan_r18_ic15.py checkpoints/pan_r18_ic15/checkpoint.pth.tar cd eval/ ./eval_ic15.sh

    please inform me with [email protected] or wechat SanQian-2012,thanks you so much.

    Originally posted by @Devin521314 in https://github.com/whai362/pan_pp.pytorch/issues/91#issuecomment-1233810612

    opened by Devin521314 0
  • Why rec encoder use EOS? not SOS

    Why rec encoder use EOS? not SOS

    hi: I find there is no 'SOS' in code, I understand SOS should be embedding at the beginning. Please tell me ,thanks! ---------------code----------------------------------------------- class Encoder(nn.Module): def init(self, hidden_dim, voc, char2id, id2char): super(Encoder, self).init() self.hidden_dim = hidden_dim self.vocab_size = len(voc) self.START_TOKEN = char2id['EOS'] self.emb = nn.Embedding(self.vocab_size, self.hidden_dim) self.att = MultiHeadAttentionLayer(self.hidden_dim, 8)

    def forward(self, x):
        batch_size, feature_dim, H, W = x.size()
        x_flatten = x.view(batch_size, feature_dim, H * W).permute(0, 2, 1)
        st = x.new_full((batch_size,), self.START_TOKEN, dtype=torch.long)
        emb_st = self.emb(st)
        holistic_feature, _ = self.att(emb_st, x_flatten, x_flatten)
        return 
    
    opened by Patickk 0
Releases(v1)
Repo for "TableParser: Automatic Table Parsing with Weak Supervision from Spreadsheets" at [email protected]

TableParser Repo for "TableParser: Automatic Table Parsing with Weak Supervision from Spreadsheets" at DS3 Lab 11 Dec 13, 2022

Natural Posterior Network: Deep Bayesian Predictive Uncertainty for Exponential Family Distributions

Natural Posterior Network This repository provides the official implementation o

Oliver Borchert 54 Dec 06, 2022
A general python framework for single object tracking in LiDAR point clouds, based on PyTorch Lightning.

Open3DSOT A general python framework for single object tracking in LiDAR point clouds, based on PyTorch Lightning. The official code release of BAT an

Kangel Zenn 172 Dec 23, 2022
A Benchmark For Measuring Systematic Generalization of Multi-Hierarchical Reasoning

Orchard Dataset This repository contains the code used for generating the Orchard Dataset, as seen in the Multi-Hierarchical Reasoning in Sequences: S

Bill Pung 1 Jun 05, 2022
MlTr: Multi-label Classification with Transformer

MlTr: Multi-label Classification with Transformer This is official implement of "MlTr: Multi-label Classification with Transformer". Abstract The task

程星 38 Nov 08, 2022
Clean Machine Learning, a Coding Kata

Kata: Clean Machine Learning From Dirty Code First, open the Kata in Google Colab (or else download it) You can clone this project and launch jupyter-

Neuraxio 13 Nov 03, 2022
KaziText is a tool for modelling common human errors.

KaziText KaziText is a tool for modelling common human errors. It estimates probabilities of individual error types (so called aspects) from grammatic

ÚFAL 3 Nov 24, 2022
Post-training Quantization for Neural Networks with Provable Guarantees

Post-training Quantization for Neural Networks with Provable Guarantees Authors: Jinjie Zhang ( Yixuan Zhou 2 Nov 29, 2022

Datasets, tools, and benchmarks for representation learning of code.

The CodeSearchNet challenge has been concluded We would like to thank all participants for their submissions and we hope that this challenge provided

GitHub 1.8k Dec 25, 2022
OBBDetection: an oriented object detection toolbox modified from MMdetection

OBBDetection note: If you have questions or good suggestions, feel free to propose issues and contact me. introduction OBBDetection is an oriented obj

MIXIAOXIN_HO 3 Nov 11, 2022
Real-Time-Student-Attendence-System - Real Time Student Attendence System

Real-Time-Student-Attendence-System The Student Attendance Management System Pro

Rounak Das 1 Feb 15, 2022
Degree-Quant: Quantization-Aware Training for Graph Neural Networks.

Degree-Quant This repo provides a clean re-implementation of the code associated with the paper Degree-Quant: Quantization-Aware Training for Graph Ne

35 Oct 07, 2022
Use .csv files to record, play and evaluate motion capture data.

Purpose These scripts allow you to record mocap data to, and play from .csv files. This approach facilitates parsing of body movement data in statisti

21 Dec 12, 2022
Individual Tree Crown classification on WorldView-2 Images using Autoencoder -- Group 9 Weak learners - Final Project (Machine Learning 2020 Course)

Created by Olga Sutyrina, Sarah Elemili, Abduragim Shtanchaev and Artur Bille Individual Tree Crown classification on WorldView-2 Images using Autoenc

2 Dec 08, 2022
A Python library for Deep Graph Networks

PyDGN Wiki Description This is a Python library to easily experiment with Deep Graph Networks (DGNs). It provides automatic management of data splitti

Federico Errica 194 Dec 22, 2022
An educational AI robot based on NVIDIA Jetson Nano.

JetBot Looking for a quick way to get started with JetBot? Many third party kits are now available! JetBot is an open-source robot based on NVIDIA Jet

NVIDIA AI IOT 2.6k Dec 29, 2022
yolox_backbone is a deep-learning library and is a collection of YOLOX Backbone models.

YOLOX-Backbone yolox-backbone is a deep-learning library and is a collection of YOLOX backbone models. Install pip install yolox-backbone Load a Pret

Yonghye Kwon 21 Dec 28, 2022
This repository stores the code to reproduce the results published in "TiWS-iForest: Isolation Forest in Weakly Supervised and Tiny ML scenarios"

TinyWeaklyIsolationForest This repository stores the code to reproduce the results published in "TiWS-iForest: Isolation Forest in Weakly Supervised a

2 Mar 21, 2022
ICSS - Interactive Continual Semantic Segmentation

Presentation This repository contains the code of our paper: Weakly-supervised c

Alteia 9 Jul 23, 2022
Implementation of Stochastic Image-to-Video Synthesis using cINNs.

Stochastic Image-to-Video Synthesis using cINNs Official PyTorch implementation of Stochastic Image-to-Video Synthesis using cINNs accepted to CVPR202

CompVis Heidelberg 135 Dec 28, 2022