利用Paddle框架复现CRAFT

Overview

CRAFT-Paddle

利用Paddle框架复现CRAFT

CRAFT

本项目基于paddlepaddle框架复现CRAFT,并参加百度第三届论文复现赛,将在2021年5月15日比赛完后提供AIStudio链接~敬请期待

参考项目:

CRAFT: Character-Region Awareness For Text detection

项目配置

pip install -r requirements.txt

你应该具有以下目录

/home/aistudio/CRAFT(工程目录)
/home/aistudio/Data(数据集文件)

数据集文件已挂载,自行解压即可

训练

The code for training is not included in this repository, and we cannot release the full training code for IP reason.

作者并未提供训练代码

权重转换

这里用到了X2Paddle神器,转换代码如下,具体使用文档参见X2Paddle

from craft import CRAFT
import torch
from collections import OrderedDict
import imgproc
import numpy as np
import cv2

def copyStateDict(state_dict):
    if list(state_dict.keys())[0].startswith("module"):
        start_idx = 1
    else:
        start_idx = 0
    new_state_dict = OrderedDict()
    for k, v in state_dict.items():
        name = ".".join(k.split(".")[start_idx:])
        new_state_dict[name] = v
    return new_state_dict

# 构建输入
input_data = np.random.rand(1, 3, 736, 1280).astype("float32")
net = CRAFT()
net.load_state_dict(copyStateDict(torch.load('craft_mlt_25k.pth')))
net = net.cuda()
net.eval()

# 进行转换
from x2paddle.convert import pytorch2paddle
pytorch2paddle(net, 
          save_dir="paddlemodel", 
          jit_type="trace", 
          input_examples=[torch.tensor(input_data).cuda()])

完成后你会出现如下文件目录

/home/aistudio/CRAFT/paddlemodel
└───inference_model
└──────model.pdiparams
└──────model.pdiparams.info
└──────model.pdmodel
└───model.pdparams
└───x2paddle_code.py

使用同样的方式转换refinenet

测试

模型下载

提取码:4yy1

AIStudio链接

cd /home/aistudio/CRAFT
python test.py

Model name Used datasets Languages Purpose Model Link
General SynthText, IC13, IC17 Eng + MLT For general purpose craft_mlt_25k
IC15 SynthText, IC15 Eng For IC15 only craft_ic15_20k
LinkRefiner CTW1500 - Used with the General Model craft_refiner_CTW1500

下图是实际测试效果

评估

可以采用以下代码进行评估

cd /home/aistudio/CRAFT
python eval.py
cd /home/aistudio/CRAFT/outputs/submit_ic15/
zip ../submit_ic15.zip *
cd /home/aistudio/CRAFT/eval
`./eval_ic15.sh` or `bash eval_ic15.sh`
Method Dataset Backbone refiner Precision (%) Recall (%) F-measure (%) Model
basenet ICDAR2015 VGG16_BN N 82.2 77.9 80.0 craft_ic15_20k
basenet ICDAR2015 VGG16_BN N 85.1 79.4 82.2 craft_mlt_25k
basenet ICDAR2015 VGG16_BN Y 61.9 45.1 52.2 craft_ic15_20k
basenet ICDAR2015 VGG16_BN Y 63.1 43.3 51.4 craft_mlt_25k

评估total_text数据集可参见我的PSNET项目eval文件价下的评估代码

关于作者

姓名 郭权浩
学校 电子科技大学研2020级
研究方向 计算机视觉
主页 Deep Hao的主页
如有错误,请及时留言纠正,非常蟹蟹!
后续会有更多论文复现系列推出,欢迎大家有问题留言交流学习,共同进步成长!
Owner
QuanHao Guo
master at UESTC
QuanHao Guo
Source code of RRPN ---- Arbitrary-Oriented Scene Text Detection via Rotation Proposals

Paper source Arbitrary-Oriented Scene Text Detection via Rotation Proposals https://arxiv.org/abs/1703.01086 News We update RRPN in pytorch 1.0! View

428 Nov 22, 2022
CUTIE (TensorFlow implementation of Convolutional Universal Text Information Extractor)

CUTIE TensorFlow implementation of the paper "CUTIE: Learning to Understand Documents with Convolutional Universal Text Information Extractor." Xiaohu

Zhao,Xiaohui 147 Dec 20, 2022
fishington.io bot with OpenCV and NumPy

fishington.io-bot fishington.io bot with using OpenCV and NumPy bot can continue to fishing fully automatically how to use Open cmd in fishington.io-b

Bahadır Araz 77 Jan 02, 2023
EQFace: An implementation of EQFace: A Simple Explicit Quality Network for Face Recognition

EQFace: A Simple Explicit Quality Network for Face Recognition The first face recognition network that generates explicit face quality online.

DeepCam Shenzhen 141 Dec 31, 2022
code for our ICCV 2021 paper "DeepCAD: A Deep Generative Network for Computer-Aided Design Models"

DeepCAD This repository provides source code for our paper: DeepCAD: A Deep Generative Network for Computer-Aided Design Models Rundi Wu, Chang Xiao,

Rundi Wu 85 Dec 31, 2022
Steve Tu 71 Dec 30, 2022
原神风花节自动弹琴辅助

GenshinAutoPlayBalladsofBreeze 原神风花节自动弹琴辅助(已适配1920*1080分辨率) 本程序基于opencv图像识别技术,不存在任何封号。 因为正确率取决于你的cpu性能,10900k都不一定全对。 由于图像识别存在误差,根本无法确定出错时间。更不用说被检测到了。

晓轩 20 Oct 27, 2022
Face Anonymizer - FaceAnonApp v1.0

Face Anonymizer - FaceAnonApp v1.0 Blur faces from image and video files in /data/files folder. Contents Repo of the source files for the FaceAnonApp.

6 Apr 18, 2022
scene-linear test images

Scene-Referred Image Collection A collection of OpenEXR Scene-Referred images, encoded as max 2048px width, DWAA 80 compression. All exrs are encoded

Gralk Klorggson 7 Aug 25, 2022
Automatic Number Plate Recognition (ANPR) is a highly accurate system capable of reading vehicle number plates without human intervention

ANPR ANPR is therefore the underlying technology used to find a vehicle license/number plate and it, in turn, supplies this information to a next stag

Melih Emin Kılıçoğlu 1 Jan 09, 2022
🔎 Like Chardet. 🚀 Package for encoding & language detection. Charset detection.

Charset Detection, for Everyone 👋 The Real First Universal Charset Detector A library that helps you read text from an unknown charset encoding. Moti

TAHRI Ahmed R. 332 Dec 31, 2022
Machine Leaning applied to denoise images to improve OCR Accuracy

Machine Learning to Denoise Images for Better OCR Accuracy This project is an adaptation of this tutorial and used only for learning purposes: https:/

Antonio Bri Pérez 2 Nov 16, 2022
Um RPG de texto orientado a objetos.

RPG de texto Um RPG de texto orientado a objetos, sem história. Um RPG (Role-playing game) baseado em texto em que você pode viajar para alguns locais

Vinicius 3 Oct 05, 2022
A little but useful tool to explore OCR data extracted with `pytesseract` and `opencv`

Screenshot OCR Tool Extracting data from screen time screenshots in iOS and Android. We are exploring 3 options: Simple OCR with no text position usin

Gabriele Marini 1 Dec 07, 2021
Extract tables from scanned image PDFs using Optical Character Recognition.

ocr-table This project aims to extract tables from scanned image PDFs using Optical Character Recognition. Install Requirements Tesseract OCR sudo apt

Abhijeet Singh 209 Dec 06, 2022
Go package for OCR (Optical Character Recognition), by using Tesseract C++ library

gosseract OCR Golang OCR package, by using Tesseract C++ library. OCR Server Do you just want OCR server, or see the working example of this package?

Hiromu OCHIAI 1.9k Dec 28, 2022
Pixie - A full-featured 2D graphics library for Python

Pixie - A full-featured 2D graphics library for Python Pixie is a 2D graphics library similar to Cairo and Skia. pip install pixie-python Features: Ty

treeform 65 Dec 30, 2022
TextField: Learning A Deep Direction Field for Irregular Scene Text Detection (TIP 2019)

TextField: Learning A Deep Direction Field for Irregular Scene Text Detection Introduction The code and trained models of: TextField: Learning A Deep

Yukang Wang 101 Dec 12, 2022
Scan the MRZ code of a passport and extract the firstname, lastname, passport number, nationality, date of birth, expiration date and personal numer.

PassportScanner Works with 2 and 3 line identity documents. What is this With PassportScanner you can use your camera to scan the MRZ code of a passpo

Edwin Vermeer 441 Dec 24, 2022