PSENet reproduction based on the Paddle framework


PSENet-Paddle


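This project reproduces PSENet with the PaddlePaddle framework and was entered in Baidu's third Paper Reproduction Challenge. An AIStudio link will be provided after the competition ends on May 15, 2021. Stay tuned.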

AIStudio link

Reference project:

whai362-PSENet

Environment setup

This project runs on the AIStudio platform with paddlepaddle 2.0.2-gpu. In addition, install the dependencies with `pip install mmcv editdistance Polygon3 pyclipper` or `pip install -r requirement.txt`.
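As a quick sanity check after installation, the short sketch below reports which of the required packages cannot be found. The import names are assumptions: the Polygon3 package is normally imported as `Polygon`, and paddlepaddle as `paddle`. The helper `missing_modules` is a hypothetical name, not part of this project.

```python
import importlib.util

# Import names are assumptions: Polygon3 is imported as "Polygon",
# and paddlepaddle as "paddle".
REQUIRED_MODULES = ["paddle", "mmcv", "editdistance", "Polygon", "pyclipper"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

The check only looks up each module without importing it, so it does not confirm that the installed versions are compatible.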

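Dataset

The datasets specified for the PSENet competition are already mounted in this project. You can find the mounted datasets here; they include ICDAR2015 Task4 and Total-Text.

Project directory

Note that you need to rename submitPSENet to PSENet.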

/home/aistudio/PSENet
├───data (extracted from data.zip)
├───config
├───models
├───dataset
├───eval
├───utils
├───compile.sh
├───__init__.py
├───test.py
├───train.py
├───requirement.txt
└───logo.gif

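Project configuration

Note: the AIStudio docker environment is not suitable for compiling this project, so you need to compile on a local machine and then upload the compiled files. I used the configuration below on my local machine; you can check yours with `gcc --version` and `g++ --version`.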

| Compiler | AIStudio | Local PC |
| --- | --- | --- |
| gcc | gcc (Ubuntu 7.5.0-3ubuntu1~16.04) 7.5.0 | gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 |
| g++ | g++ (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609 | g++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 |

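As the comparison shows, the g++ version on AIStudio is not compatible. Note: the local machine must match the architecture, operating system, and Python version of the target environment: (Ubuntu) linux-x86_64 and Python 3.7.

Run `./compile.sh`, or `bash compile.sh` if you get `bash: ./compile.sh: Permission denied`.

Alternatively, go to the target directory and compile manually: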

cd /home/aistudio/PSENet/models/post_processing/pse
python setup.py build_ext --inplace

After compilation, you will find build/temp.linux-x86_64-3.7/pse.o and pse.cpython-37m-x86_64-linux-gnu.so under /home/aistudio/PSENet/models/post_processing/pse.

Note: this project is already fully configured, so you can skip this step.
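If you do compile and upload the files yourself, a small check like the one below can confirm that the compiled extension is in place before training or testing. It is only a sketch: `find_pse_extension` is a hypothetical helper, and it assumes the shared library name starts with `pse` and ends with `.so`, as in the file name quoted above.

```python
import glob
import os


def find_pse_extension(pse_dir):
    """Return compiled pse extension files (pse*.so) found in pse_dir, sorted."""
    return sorted(glob.glob(os.path.join(pse_dir, "pse*.so")))


if __name__ == "__main__":
    pse_dir = "/home/aistudio/PSENet/models/post_processing/pse"
    found = find_pse_extension(pse_dir)
    if found:
        print("Found:", found)
    else:
        print("No compiled pse extension found in " + pse_dir)
```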

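Training

Note that paddlepaddle 2.0.2 does not support reading dictionary data, so I rewrote the DataLoader with an iterator in /home/aistudio/PSENet/utils/data_loader.py. This slows down data loading and makes training slightly slower; for example, one epoch with psenet_r50_ic15_1024_finetune.py takes 512.4 seconds. In addition, paddlepaddle 2.0.2 does not yet provide an Identity layer, so I implemented Identity by subclassing `paddle.nn.Layer` in /home/aistudio/PSENet/models/utils/fuse_conv_bn.py.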
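To illustrate the idea of an iterator-based loader for dictionary samples, here is a minimal pure-Python sketch. It is not the code in utils/data_loader.py: the function name `iter_dict_batches` and the sample keys `imgs` and `gt_texts` are hypothetical, and a real loader would also stack arrays, shuffle, and so on.

```python
def iter_dict_batches(dataset, batch_size):
    """Yield batches as dicts mapping each key to a list of per-sample values."""
    batch = []
    for index in range(len(dataset)):
        batch.append(dataset[index])
        if len(batch) == batch_size:
            yield {key: [sample[key] for sample in batch] for key in batch[0]}
            batch = []
    if batch:  # final, possibly smaller batch
        yield {key: [sample[key] for sample in batch] for key in batch[0]}


# Toy dataset with hypothetical keys; real samples would hold image arrays.
toy_dataset = [{"imgs": i, "gt_texts": i * 10} for i in range(5)]
batches = list(iter_dict_batches(toy_dataset, batch_size=2))
print(len(batches))  # 3
print(batches[0])    # {'imgs': [0, 1], 'gt_texts': [0, 10]}
```

Collating samples one by one in Python like this is simple but runs in a single process, which is one plausible reason such a loader is slower than a built-in one.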

cd /home/aistudio/PSENet/
python train.py ${CONFIG_FILE}

For example:

cd /home/aistudio/PSENet/
python train.py config/psenet/psenet_r50_ic15_736.py

When training starts, a folder such as /home/aistudio/PSENet/checkpoints/psenet_r50_ic15_1024_finetune is created; the weights and optimizer state are saved there.
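The example folder name suggests that the checkpoint folder is named after the config file. Assuming that naming scheme (it is inferred from the example above, not confirmed from train.py), the expected folder can be derived as in this sketch; `checkpoint_dir_for` is a hypothetical helper.

```python
import os


def checkpoint_dir_for(config_path, root="checkpoints"):
    """Map a config file path to a checkpoint folder named after the config."""
    # Assumption: the folder name is the config file name without ".py".
    name = os.path.splitext(os.path.basename(config_path))[0]
    return os.path.join(root, name)


print(checkpoint_dir_for("config/psenet/psenet_r50_ic15_1024_finetune.py"))
```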

Testing

cd /home/aistudio/PSENet/
python test.py ${CONFIG_FILE} ${CHECKPOINT_FILE}

For example:

cd /home/aistudio/PSENet/
python test.py config/psenet/psenet_r50_ic15_736.py PSENet/PretrainedModel/checkpoint_ic15_736.pdparams

Evaluation

Note that testing and evaluation are sequential: run the test first to generate the result files, then run the evaluation.

ICDAR 2015

cd /home/aistudio/PSENet/eval
`./eval_ic15.sh` or `bash ./eval_ic15.sh`

You will get output similar to the following:

Calculated!{"precision": 0.8620689655172413, "recall": 0.7944150216658642, "hmean": 0.826860435980957, "AP": 0}
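The `hmean` value in this output is the harmonic mean of precision and recall. As a small sketch, the line can be parsed and the value recomputed as shown below; the helper names are hypothetical and the parsing assumes the `Calculated!` prefix is followed by valid JSON, as in the sample line.

```python
import json


def parse_ic15_result(line):
    """Parse a 'Calculated!{...}' line into a dict of metrics."""
    prefix = "Calculated!"
    if not line.startswith(prefix):
        raise ValueError("unexpected result line: " + line)
    return json.loads(line[len(prefix):])


def harmonic_mean(precision, recall):
    """Harmonic mean (F-measure) of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


line = ('Calculated!{"precision": 0.8620689655172413, '
        '"recall": 0.7944150216658642, "hmean": 0.826860435980957, "AP": 0}')
metrics = parse_ic15_result(line)
recomputed = harmonic_mean(metrics["precision"], metrics["recall"])
print(round(recomputed, 4), round(metrics["hmean"], 4))
```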

The metrics of the paddlepaddle pretrained models are listed below:

| Method | Backbone | Fine-tuning | Scale | Config | Precision (%) | Recall (%) | F-measure (%) | Model |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PSENet | ResNet50 | N | Shorter Side: 736 | psenet_r50_ic15_736.py | 83.6 | 74.0 | 78.5 | checkpoint_ic15_736 |
| PSENet | ResNet50 | N | Shorter Side: 1024 | psenet_r50_ic15_1024.py | 84.4 | 76.3 | 80.2 | checkpoint_ic15_1024 |
| PSENet | ResNet50 | Y | Shorter Side: 736 | psenet_r50_ic15_736_finetune.py | 85.3 | 76.8 | 80.9 | checkpoint_ic15_736_finetune |
| PSENet | ResNet50 | Y | Shorter Side: 1024 | psenet_r50_ic15_1024_finetune.py | 86.2 | 79.4 | 82.7 | checkpoint_ic15_1024_finetune |

Total-Text

Text detection

cd /home/aistudio/PSENet/eval
`./eval_tt.sh` or `bash ./eval_tt.sh`

You will get output similar to the following:

Precision:_0.8727937336814604_______/Recall:_0.7786751361161512/Hmean:_0.8230524859472805
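The Total-Text result line uses underscores as padding, which makes it awkward to read programmatically. The sketch below extracts the three numbers with a regular expression; the pattern is written against the sample line above and `parse_tt_result` is a hypothetical helper, so other output formats may need adjustments.

```python
import re

# Matches "Precision:_0.87", "Recall:_0.77", "Hmean:_0.82" style fields.
PATTERN = re.compile(r"(Precision|Recall|Hmean):_+([0-9]*\.?[0-9]+)")


def parse_tt_result(line):
    """Return a dict with lowercase metric names mapped to float values."""
    return {name.lower(): float(value) for name, value in PATTERN.findall(line)}


line = ("Precision:_0.8727937336814604_______/"
        "Recall:_0.7786751361161512/Hmean:_0.8230524859472805")
result = parse_tt_result(line)
print(sorted(result))  # ['hmean', 'precision', 'recall']
```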


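The metrics of the paddlepaddle pretrained models are listed below:

| Method | Backbone | Fine-tuning | Config | Precision (%) | Recall (%) | F-measure (%) | Model |
| --- | --- | --- | --- | --- | --- | --- | --- |
| PSENet | ResNet50 | N | psenet_r50_tt.py | 87.3 | 77.9 | 82.3 | checkpoint_tt |
| PSENet | ResNet50 | Y | psenet_r50_tt_finetune.py | 89.3 | 79.6 | 84.2 | checkpoint_tt_finetune |

Speed test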

python test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --report_speed

For example:

cd /home/aistudio/PSENet/
python test.py config/psenet/psenet_r50_ic15_736.py PSENet/PretrainedModel/checkpoint_ic15_736.pdparams --report_speed

You will get output similar to the following:

Testing 283/3000
backbone_time: 0.0152
neck_time: 0.0029
det_head_time: 0.0005
det_pse_time: 0.0660
FPS: 11.8
Testing 284/3000
backbone_time: 0.0152
neck_time: 0.0029
det_head_time: 0.0005
det_pse_time: 0.0660
FPS: 11.8
Testing 285/3000
backbone_time: 0.0152
neck_time: 0.0029
det_head_time: 0.0005
det_pse_time: 0.0660
FPS: 11.8
Testing 286/3000
backbone_time: 0.0152
neck_time: 0.0029
det_head_time: 0.0005
det_pse_time: 0.0660
FPS: 11.8
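In this sample, the reported FPS is consistent with the reciprocal of the summed stage times (0.0152 + 0.0029 + 0.0005 + 0.0660 = 0.0846 s per image). Whether test.py computes FPS exactly this way is an assumption; the sketch below only reproduces the arithmetic from one log block, using hypothetical helper names.

```python
import re

# One "<stage>_time: <seconds>" entry per line.
TIME_PATTERN = re.compile(r"^(\w+_time):\s*([0-9.]+)$", re.MULTILINE)


def stage_times(log_text):
    """Extract per-stage times (in seconds) from one log block."""
    return {name: float(value) for name, value in TIME_PATTERN.findall(log_text)}


def fps_from_stage_times(times):
    """Frames per second, assuming FPS = 1 / (sum of stage times)."""
    return 1.0 / sum(times.values())


log_block = "\n".join([
    "Testing 283/3000",
    "backbone_time: 0.0152",
    "neck_time: 0.0029",
    "det_head_time: 0.0005",
    "det_pse_time: 0.0660",
    "FPS: 11.8",
])
times = stage_times(log_block)
print(round(fps_from_stage_times(times), 1))  # 11.8
```

The breakdown also shows that the PSE post-processing step (det_pse_time) accounts for most of the per-image time in this run.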

Citation

@inproceedings{wang2019shape,
  title={Shape robust text detection with progressive scale expansion network},
  author={Wang, Wenhai and Xie, Enze and Li, Xiang and Hou, Wenbo and Lu, Tong and Yu, Gang and Shao, Shuai},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={9336--9345},
  year={2019}
}
Owner
QuanHao Guo
master at UESTC