PSENet Reproduction Based on the Paddle Framework

Last update: Apr 24, 2022

PSENet-Paddle

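A reproduction of PSENet on the Paddle framework

This project reproduces PSENet on the paddlepaddle framework and was entered in Baidu's 3rd Paper Reproduction Competition. An AIStudio link will be provided after the competition ends on May 15, 2021. Stay tuned.

AIStudio link

Reference project:

whai362-PSENet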

Environment setup

This project runs on the AIStudio platform with paddlepaddle 2.0.2-gpu. In addition, install the dependencies with pip install mmcv editdistance Polygon3 pyclipper or with pip install -r requirement.txt.
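Before training or testing, it can help to confirm that the dependencies are importable. The snippet below is a minimal sketch and not part of the project: the helper name missing_modules is hypothetical, and it assumes that the Polygon3 package is imported under the module name Polygon.

```python
import importlib.util

# Import names assumed for the pip packages listed above
# (Polygon3 is assumed to install a module named "Polygon").
REQUIRED_MODULES = ["paddle", "mmcv", "editdistance", "Polygon", "pyclipper"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Running it prints either the list of modules that still need to be installed or a confirmation that everything was found.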

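Datasets

This project ships with the datasets designated for the PSENet competition. You can find the bundled datasets here; they include ICDAR2015 Task4 and Total-Text.

Project layout

Note that you need to rename submitPSENet to PSENet.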

/home/aistudio/PSENet
├───data (extracted from data.zip)
├───config
├───models
├───dataset
├───eval
├───utils
├───compile.sh
├───__init__.py
├───test.py
├───train.py
├───requirement.txt
└───logo.gif

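Project configuration

Note: the AIStudio docker environment is not suitable for compiling this project, so you need to compile on a local machine and then upload the compiled files. On my local machine I used the configuration below; you can check yours with gcc --version and g++ --version.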

AIStudio	Local PC
gcc (Ubuntu 7.5.0-3ubuntu1~16.04) 7.5.0 Copyright (C) 2017 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.	gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 Copyright (C) 2017 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
g++ (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609 Copyright (C) 2015 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.	g++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 Copyright (C) 2017 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

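As the table shows, the g++ version on AIStudio is not compatible. Note: the local machine must have the same architecture, operating system, and Python version as AIStudio, that is, (Ubuntu) linux-x86_64 and python3.7.

Run `./compile.sh`, or `bash compile.sh` if you get bash: ./compile.sh: Permission denied

Alternatively, go to the target directory and compile manually: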

cd /home/aistudio/PSENet/models/post_processing/pse
python setup.py build_ext --inplace

After compiling, you will find build/temp.linux-x86_64-3.7/pse.o and pse.cpython-37m-x86_64-linux-gnu.so under /home/aistudio/PSENet/models/post_processing/pse.

Note: this project has already been fully configured, so you can skip this step.
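The file name pse.cpython-37m-x86_64-linux-gnu.so encodes the Python version and platform the extension was built for, which is why the build machine has to match AIStudio. As a rough sketch (the helper name extension_matches is hypothetical and not part of the project), the suffix the running interpreter expects can be compared against a compiled file name:

```python
import sysconfig


def extension_matches(filename, ext_suffix=None):
    """Check whether a compiled extension file name ends with the suffix
    expected by an interpreter (by default, the one running this code)."""
    if ext_suffix is None:
        # e.g. ".cpython-37m-x86_64-linux-gnu.so" on CPython 3.7, Linux x86_64
        ext_suffix = sysconfig.get_config_var("EXT_SUFFIX")
    return bool(ext_suffix) and filename.endswith(ext_suffix)


if __name__ == "__main__":
    name = "pse.cpython-37m-x86_64-linux-gnu.so"
    print(name, "matches this interpreter:", extension_matches(name))
```

If this prints False on the machine where you intend to run the project, the uploaded .so file will most likely not be importable there.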

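Training

Note that paddlepaddle 2.0.2 does not support loading dictionary-style data, so I rewrote the DataLoader with an iterator in /home/aistudio/PSENet/utils/data_loader.py. This slows down data loading and makes training slightly slower; for example, one epoch with psenet_r50_ic15_1024_finetune.py takes 512.4 seconds. In addition, paddlepaddle 2.0.2 does not yet provide an Identity method, so I wrote an Identity class by subclassing paddle.nn.Layer in /home/aistudio/PSENet/models/utils/fuse_conv_bn.py.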
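To illustrate the idea of an iterator-based loader for dictionary samples, here is a framework-free sketch. It is not the code in utils/data_loader.py: the class name DictBatchIterator and its parameters are hypothetical, and it only groups values into lists instead of stacking tensors.

```python
import random


class DictBatchIterator:
    """Iterate over a dataset of dict samples and yield dict batches.

    Each batch maps every key to the list of values from its samples.
    """

    def __init__(self, dataset, batch_size, shuffle=False, drop_last=False, seed=None):
        self.dataset = dataset
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.drop_last = drop_last
        self._rng = random.Random(seed)

    def __len__(self):
        n = len(self.dataset)
        if self.drop_last:
            return n // self.batch_size
        return (n + self.batch_size - 1) // self.batch_size

    def __iter__(self):
        indices = list(range(len(self.dataset)))
        if self.shuffle:
            self._rng.shuffle(indices)
        for start in range(0, len(indices), self.batch_size):
            chunk = indices[start:start + self.batch_size]
            if self.drop_last and len(chunk) < self.batch_size:
                break
            samples = [self.dataset[i] for i in chunk]
            # Collate: one list per key, in sample order.
            yield {key: [s[key] for s in samples] for key in samples[0]}


if __name__ == "__main__":
    toy = [{"img": i, "label": 2 * i} for i in range(5)]
    for batch in DictBatchIterator(toy, batch_size=2):
        print(batch)
```

Because batches are assembled in a plain Python loop in a single process, a loader of this kind is simple but can be slower than a multi-worker loader, which is consistent with the slowdown described above.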

cd /home/aistudio/PSENet/
python train.py ${CONFIG_FILE}

For example:

cd /home/aistudio/PSENet/
python train.py config/psenet/psenet_r50_ic15_736.py

When training starts, a folder such as /home/aistudio/PSENet/checkpoints/psenet_r50_ic15_1024_finetune is created; the weights and optimizer parameters are saved there.

Testing
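The example folder name matches the name of a config file without its extension. Assuming the checkpoint folder is derived that way (an assumption based only on the names shown here, not on the contents of train.py), the expected location could be computed as follows; the helper name checkpoint_dir_for is hypothetical.

```python
from pathlib import Path


def checkpoint_dir_for(config_file, root="checkpoints"):
    """Guess the checkpoint folder for a config file.

    Assumes the folder is named after the config file's stem.
    """
    return Path(root) / Path(config_file).stem


if __name__ == "__main__":
    print(checkpoint_dir_for("config/psenet/psenet_r50_ic15_1024_finetune.py"))
```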

cd /home/aistudio/PSENet/
python test.py ${CONFIG_FILE} ${CHECKPOINT_FILE}

For example:

cd /home/aistudio/PSENet/
python test.py config/psenet/psenet_r50_ic15_736.py PSENet/PretrainedModel/checkpoint_ic15_736.pdparams

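Evaluation

Note that testing and evaluation are sequential steps: run the test first to generate the result files, then run the evaluation.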

ICDAR 2015

cd /home/aistudio/PSENet/eval
`./eval_ic15.sh` or `bash ./eval_ic15.sh`

You will get output similar to the following:

Calculated!{"precision": 0.8620689655172413, "recall": 0.7944150216658642, "hmean": 0.826860435980957, "AP": 0}
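The hmean value in this output is the harmonic mean of precision and recall, reported as F-measure in the tables. The sketch below parses the sample line and recomputes it; the function names are hypothetical and the parsing assumes the line always contains a single JSON object after the "Calculated!" prefix.

```python
import json


def harmonic_mean(precision, recall):
    """F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def parse_ic15_result(line):
    """Extract the JSON object that follows the 'Calculated!' prefix."""
    return json.loads(line[line.index("{"):])


sample = ('Calculated!{"precision": 0.8620689655172413, '
          '"recall": 0.7944150216658642, "hmean": 0.826860435980957, "AP": 0}')
result = parse_ic15_result(sample)
recomputed = harmonic_mean(result["precision"], result["recall"])

if __name__ == "__main__":
    for key in ("precision", "recall", "hmean"):
        print(f"{key}: {100 * result[key]:.1f}%")
    print(f"recomputed hmean: {recomputed:.6f}")
```

Rounded to one decimal place, the three values are 86.2, 79.4, and 82.7, which correspond to the psenet_r50_ic15_1024_finetune.py row in the results table.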

The following are the test metrics of the paddlepaddle pretrained models:

Method	Backbone	Fine-tuning	Scale	Config	Precision (%)	Recall (%)	F-measure (%)	Model
PSENet	ResNet50	N	Shorter Side: 736	psenet_r50_ic15_736.py	83.6	74.0	78.5	checkpoint_ic15_736
PSENet	ResNet50	N	Shorter Side: 1024	psenet_r50_ic15_1024.py	84.4	76.3	80.2	checkpoint_ic15_1024
PSENet	ResNet50	Y	Shorter Side: 736	psenet_r50_ic15_736_finetune.py	85.3	76.8	80.9	checkpoint_ic15_736_finetune
PSENet	ResNet50	Y	Shorter Side: 1024	psenet_r50_ic15_1024_finetune.py	86.2	79.4	82.7	checkpoint_ic15_1024_finetune

Total-Text

Text detection

cd /home/aistudio/PSENet/eval
`./eval_tt.sh` or `bash ./eval_tt.sh`

You will get output similar to the following:

Precision:_0.8727937336814604_______/Recall:_0.7786751361161512/Hmean:_0.8230524859472805
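The Total-Text script prints its metrics in an underscore-padded format rather than JSON. If you want the numbers programmatically, a small regular expression is enough. This is a sketch written against the sample line above only; the function name parse_tt_metrics is hypothetical.

```python
import re

# Matches "Precision:_0.87...", "Recall:_0.77...", "Hmean:_0.82..."
METRIC_PATTERN = re.compile(r"(Precision|Recall|Hmean):_([0-9]*\.?[0-9]+)")


def parse_tt_metrics(line):
    """Return a dict such as {'precision': ..., 'recall': ..., 'hmean': ...}."""
    return {name.lower(): float(value) for name, value in METRIC_PATTERN.findall(line)}


sample = ("Precision:_0.8727937336814604_______/"
          "Recall:_0.7786751361161512/Hmean:_0.8230524859472805")
metrics = parse_tt_metrics(sample)

if __name__ == "__main__":
    for name, value in metrics.items():
        print(f"{name}: {100 * value:.1f}%")
```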

The following are the test metrics of the paddlepaddle pretrained models:

Method	Backbone	Fine-tuning	Config	Precision (%)	Recall (%)	F-measure (%)	Model
PSENet	ResNet50	N	psenet_r50_tt.py	87.3	77.9	82.3	checkpoint_tt
PSENet	ResNet50	Y	psenet_r50_tt_finetune.py	89.3	79.6	84.2	checkpoint_tt_finetune

Speed test

python test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --report_speed

For example:

cd /home/aistudio/PSENet/
python test.py config/psenet/psenet_r50_ic15_736.py PSENet/PretrainedModel/checkpoint_ic15_736.pdparams --report_speed

You will get output similar to the following:

Testing 283/3000
backbone_time: 0.0152
neck_time: 0.0029
det_head_time: 0.0005
det_pse_time: 0.0660
FPS: 11.8
Testing 284/3000
backbone_time: 0.0152
neck_time: 0.0029
det_head_time: 0.0005
det_pse_time: 0.0660
FPS: 11.8
Testing 285/3000
backbone_time: 0.0152
neck_time: 0.0029
det_head_time: 0.0005
det_pse_time: 0.0660
FPS: 11.8
Testing 286/3000
backbone_time: 0.0152
neck_time: 0.0029
det_head_time: 0.0005
det_pse_time: 0.0660
FPS: 11.8
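The reported FPS is consistent with the reciprocal of the summed stage times. The sketch below checks this for the sample log; it assumes the four stages run one after another and account for essentially all of the per-image time, and it does not claim that test.py computes FPS in exactly this way.

```python
# Per-image stage times in seconds, copied from the sample log above.
stage_times = {
    "backbone_time": 0.0152,
    "neck_time": 0.0029,
    "det_head_time": 0.0005,
    "det_pse_time": 0.0660,
}


def fps_from_stage_times(times):
    """Frames per second if the stages run sequentially."""
    return 1.0 / sum(times.values())


def stage_shares(times):
    """Fraction of the total time spent in each stage."""
    total = sum(times.values())
    return {name: value / total for name, value in times.items()}


if __name__ == "__main__":
    print(f"FPS: {fps_from_stage_times(stage_times):.1f}")
    for name, share in stage_shares(stage_times).items():
        print(f"{name}: {100 * share:.0f}% of total")
```

In this sample, the PSE post-processing step (det_pse_time) takes roughly three quarters of the total time, so it is the main factor limiting throughput.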

Citation

@inproceedings{wang2019shape,
  title={Shape robust text detection with progressive scale expansion network},
  author={Wang, Wenhai and Xie, Enze and Li, Xiang and Hou, Wenbo and Lu, Tong and Yu, Gang and Shao, Shuai},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={9336--9345},
  year={2019}
}

Owner

QuanHao Guo
