An easy-to-use OCR project with multilingual support.

Last update: Nov 10, 2022

Overview

AgentOCR

Introduction

AgentOCR is a simple, easy-to-call OCR project built on PaddleOCR and ONNXRuntime.
The project currently includes the Python package [AgentOCR] and the OCR annotation tool [AgentOCRLabeling].

User Guide

Python Package:

Quick install:

# Install AgentOCR
$ pip install agentocr

# Install the ONNXRuntime build that matches your device and platform
$ pip install onnxruntime

Basic usage:

# Import OCRSystem
from agentocr import OCRSystem

# Initialize the OCR model
ocr = OCRSystem(config='ch')

# Run OCR on an image
results = ocr.ocr('test.jpg')

Server deployment:

Start the AgentOCR Server service:
```
$ agentocr server
```

Calling from Python:

import cv2
import json
import base64
import requests

# Encode an image as Base64
def cv2_to_base64(image):
    data = cv2.imencode('.jpg', image)[1]
    image_base64 = base64.b64encode(data.tobytes()).decode('UTF-8')
    return image_base64


# Read the image
image = cv2.imread('test.jpg')
image_base64 = cv2_to_base64(image)

# Build the request data
data = {
    'image': image_base64
}

# Send the request
url = "http://127.0.0.1:5000/ocr"
r = requests.post(url=url, data=json.dumps(data))

# Print the prediction results
print(r.json())
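
If OpenCV is not installed on the client, the same request body can be built with the standard library alone. This is a minimal sketch. It assumes the server only needs the Base64 text of an encoded image file under the `image` key, as in the example above, and this has not been verified against the server. The helper name `build_ocr_payload` is hypothetical and not part of AgentOCR.

```python
import base64
import json


def build_ocr_payload(image_bytes: bytes) -> str:
    """Return a JSON request body with the image Base64-encoded under 'image'.

    Hypothetical helper (not part of AgentOCR); it mirrors the payload
    structure used in the request example above.
    """
    image_base64 = base64.b64encode(image_bytes).decode('UTF-8')
    return json.dumps({'image': image_base64})


if __name__ == '__main__':
    # Placeholder bytes for demonstration; in practice read them from a
    # file, e.g. open('test.jpg', 'rb').read()
    sample = b'\xff\xd8\xff\xe0 demo bytes'
    body = build_ocr_payload(sample)
    # Round-trip check: decoding the payload gives back the original bytes
    print(base64.b64decode(json.loads(body)['image']) == sample)
```

The resulting string can be passed as the `data` argument of `requests.post`, in place of `json.dumps(data)`.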

Jupyter Notebook: [Quick Start]
For more installation and usage details, see: [Package User Guide]

Multilingual Support

Configuration files for the following languages are preset and can be selected directly by their abbreviation:

Language	Abbreviation	Language	Abbreviation
Chinese (Chinese and English)	ch	Bulgarian	bg
English	en	Ukrainian	uk
French	fr	Belarusian	be
German	german	Telugu	te
Japanese	japan	Abaza	abq
Korean	korean	Tamil	ta
Traditional Chinese	cht	Afrikaans	af
Italian	it	Azerbaijani	az
Spanish	es	Bosnian	bs
Portuguese	pt	Czech	cs
Russian	ru	Welsh	cy
Arabic	ar	Danish	da
Hindi	hi	Estonian	et
Uyghur	ug	Irish	ga
Persian	fa	Croatian	hr
Urdu	ur	Hungarian	hu
Serbian (Latin)	rs_latin	Indonesian	id
Occitan	oc	Icelandic	is
Marathi	mr	Kurdish	ku
Nepali	ne	Lithuanian	lt
Serbian (Cyrillic)	rs_cyrillic	Latvian	lv
Maori	mi	Dargwa	dar
Malay	ms	Ingush	inh
Maltese	mt	Lak	lbe
Dutch	nl	Lezghian	lez
Norwegian	no	Tabassaran	tab
Polish	pl	Bihari	bh
Romanian	ro	Maithili	mai
Slovak	sk	Angika	ang
Slovenian	sl	Bhojpuri	bho
Albanian	sq	Magahi	mah
Swedish	sv	Nagpur	sck
Swahili	sw	Newari	new
Tagalog	tl	Goan Konkani	gom
Turkish	tr	Saudi Arabia	sa
Uzbek	uz	Avar	ava
Vietnamese	vi	Adyghe	ady
Mongolian	mn
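
Several abbreviations in the table are full words (`german`, `japan`, `korean`) rather than two-letter codes, so a small lookup can catch typos before a model is loaded. This is a minimal sketch covering only a few rows of the table above. `LANGUAGE_CONFIGS` and `resolve_config` are illustrative names and not part of AgentOCR.

```python
# A few rows copied from the table above; extend as needed.
LANGUAGE_CONFIGS = {
    'chinese': 'ch',
    'english': 'en',
    'french': 'fr',
    'german': 'german',
    'japanese': 'japan',
    'korean': 'korean',
    'traditional chinese': 'cht',
}


def resolve_config(language: str) -> str:
    """Map a language name (or an abbreviation) to a config abbreviation."""
    key = language.strip().lower()
    if key in LANGUAGE_CONFIGS:
        return LANGUAGE_CONFIGS[key]
    if key in LANGUAGE_CONFIGS.values():
        return key  # already an abbreviation from the table
    raise ValueError(f'no preset config for language: {language!r}')


if __name__ == '__main__':
    print(resolve_config('Japanese'))
    print(resolve_config('cht'))
```

The returned string could then be passed as `OCRSystem(config=...)`, as in the basic usage example.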

Pretrained Models

Detection models:

Model Name	Model Type	Pretrained Model
ch_ppocr_mobile_v2.0_det	det	Download
ch_ppocr_server_v2.0_det	det	Download
en_ppocr_mobile_v2.0_det	det	Download
en_ppocr_mobile_v2.0_table_det	det	Download

Classification models:

Model Name	Model Type	Pretrained Model
ch_ppocr_mobile_v2.0_cls	cls	Download

Recognition models:

Model Name	Model Type	Pretrained Model
ch_ppocr_mobile_v2.0_rec	rec	Download
ch_ppocr_server_v2.0_rec	rec	Download
ka_ppocr_mobile_v2.0_rec	rec	Download
te_ppocr_mobile_v2.0_rec	rec	Download
ta_ppocr_mobile_v2.0_rec	rec	Download
cht_ppocr_mobile_v2.0_rec	rec	Download
japan_ppocr_mobile_v2.0_rec	rec	Download
latin_ppocr_mobile_v2.0_rec	rec	Download
arabic_ppocr_mobile_v2.0_rec	rec	Download
korean_ppocr_mobile_v2.0_rec	rec	Download
french_ppocr_mobile_v2.0_rec	rec	Download
german_ppocr_mobile_v2.0_rec	rec	Download
cyrillic_ppocr_mobile_v2.0_rec	rec	Download
en_ppocr_mobile_v2.0_table_rec	rec	Download
en_ppocr_mobile_v2.0_number_rec	rec	Download
devanagari_ppocr_mobile_v2.0_rec	rec	Download

Comments

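Does prediction speed on Linux differ from Windows?
Has anyone tested the performance difference between running agentocr on Linux and on Windows? The speed gap feels large. I cannot compare on identical hardware, so I am only guessing that the operating system may be the cause.

All environments below use Python 3.7.10 and agentocr 1.2.0, and predict the same image.

Local environment: a laptop running Windows 10 with an AMD Ryzen 7 5800H CPU (8 cores, 16 threads). Prediction takes under 2.5 seconds.

A Linux server running CentOS 7: a 4-core virtual machine allocated from an Intel(R) Xeon(R) CPU E5-2680 v4. Prediction takes under 7.2 seconds.

A CentOS 7 virtual machine in VirtualBox on the laptop above, assigned 4 CPUs (VirtualBox shows 16 CPUs, so presumably 4 hardware threads were assigned). Prediction time is close to that of the Linux server in the second item.

A Windows server running Windows Server 2012: a 2-core virtual machine allocated from an Intel i7-8700. Prediction takes under 4.7 seconds.

Judging from how the task runs: on Windows, Task Manager shows all cores working during prediction, whereas on Linux the top command shows CPU usage peaking at only 200%, when 4 cores should in theory reach 400%. Is prediction slow because not all cores are participating? My setup is limited, and there is no direct performance comparison between the laptop CPU and the server CPU, but even an older server CPU should not be this far behind a 7 nm AMD laptop CPU. If anyone has tested this or knows the reason, please let me know!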
opened by w-Bro 10
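
One way to narrow down a question like this is to check how many CPUs the Python process is allowed to use, and to time the same call several times on each machine. The sketch below uses only the standard library. `usable_cpu_count` and `time_call` are hypothetical helpers, and the lambda is a stand-in for the real OCR call.

```python
import os
import time


def usable_cpu_count() -> int:
    """Number of CPUs this process may run on (affinity-aware on Linux)."""
    if hasattr(os, 'sched_getaffinity'):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


def time_call(func, repeats: int = 3) -> float:
    """Return the best wall-clock time in seconds over several runs."""
    best = float('inf')
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == '__main__':
    print('usable CPUs:', usable_cpu_count())
    # Replace the lambda with the real call, e.g. lambda: ocr.ocr('test.jpg')
    print('best time (s):', time_call(lambda: sum(range(100_000))))
```

If the reported CPU count is lower than the number of cores assigned to the virtual machine, the limit probably comes from the environment rather than from the OCR code.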
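[PaddlePaddle Hackathon] 100 Build deep-learning plugins for Rubick
(This issue is a task issue for the PaddlePaddle Hackathon; for details, see PaddlePaddle Hackathon.)

[Task description]

Task title: Build deep-learning plugins for Rubick

Difficulty: Medium (5000 RMB awarded on passing acceptance)

Technical tags: JavaScript, PaddlePaddle

Detailed description: With the emergence of high-quality desktop productivity toolboxes such as Rubick and Utools, adding deep learning to them opens up more interesting possibilities. In this task, you can use AgentOCR or other PaddlePaddle-related deep learning tools, together with Paddle.JS or ONNX.JS, to deploy a deep learning model as a Rubick plugin. For example, AgentOCR's OCR capability could give Rubick's screenshot feature text recognition. You may also pick any model you like; any submission developed as a Rubick plugin counts as valid.

Paddle.JS home page: https://github.com/PaddlePaddle/Paddle.js

AgentOCR home page: https://github.com/AgentMaker/AgentOCR

[Submission]

Project PR to AgentOCR

Technical documentation

[Technical requirements]

JavaScript development skills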

PaddlePaddle Hackathon
opened by GT-ZhangAcer 0
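[PaddlePaddle Hackathon] 99 Adapt the AgentOCR tool to JavaScript environments
(This issue is a task issue for the PaddlePaddle Hackathon; for details, see PaddlePaddle Hackathon.)

[Task description]

Task title: Adapt the AgentOCR tool to JavaScript environments

Technical tags: JavaScript

Difficulty: Easy

Detailed description: In web front-end, mobile app, and even desktop application development, JavaScript's strong compatibility makes cross-platform applications more convenient. AgentOCR currently provides three backends: PaddlePaddle, ONNX, and DML. To let the PaddleOCR-based AgentOCR fit the environments more developers need, OCR inference in JavaScript can be supported by any approach, including but not limited to Paddle.JS or ONNX.JS. In this project, you need to build a Paddle.JS or ONNX.JS version of the AgentOCR package with low loss of accuracy and speed.

Paddle.JS home page: https://github.com/PaddlePaddle/Paddle.js

AgentOCR home page: https://github.com/AgentMaker/AgentOCR

[Submission]

Project PR to AgentOCR

Technical documentation

[Technical requirements]

JavaScript development skills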

PaddlePaddle Hackathon
opened by GT-ZhangAcer 0

Releases(2.0.0)

2.0.0(Sep 29, 2021)
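Note:

Model files for 2.x and 1.x are not compatible with each other

Updates:

Added PaddleOCR v2 models

Optimized the recognition model dictionaries

Removed built-in fonts and JSON configuration files

Multilingual support now switches by language type rather than by specific language

Wheel package size reduced to about 100 KB

Added the Chinese license plate detection and recognition subproject [AgentCLPR]

OCR annotation tool supports text in more languages

Cross-platform executable annotation tool [Coming soon]

OCR graphical interface [Coming soon]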

Source code(tar.gz)
Source code(zip)
agentocr-2.0.0-py3-none-any.whl(105.09 KB)
1.3.0(Sep 2, 2021)
Optimized the recognition code and aligned recognition model accuracy

Source code(tar.gz)
Source code(zip)
agentocr-1.3.0-py3-none-any.whl(12.27 MB)
1.2.0(Aug 23, 2021)
Added server deployment

Fixed a bug in the API when detection is turned off

Added API comments

Source code(tar.gz)
Source code(zip)
agentocr-1.2.0-py3-none-any.whl(12.27 MB)
1.1.3(Aug 21, 2021)
Improved command-line functionality

Renamed code directories

Source code(tar.gz)
Source code(zip)
agentocr-1.1.3-py3-none-any.whl(12.27 MB)
1.1.2(Aug 20, 2021)
Improved log messages

Removed unused configuration options

Updated documentation

Source code(tar.gz)
Source code(zip)
agentocr-1.1.2-py3-none-any.whl(12.28 MB)
1.1.1(Aug 20, 2021)
Configuration options can be overridden directly through the API

Classification is now off by default

Source code(tar.gz)
Source code(zip)
agentocr-1.1.1-py3-none-any.whl(12.28 MB)
1.0.0(Aug 18, 2021)
Initial release

Source code(tar.gz)
Source code(zip)
agentocr-1.0.0-py3-none-any.whl(12.27 MB)

Owner

AgentMaker

Focus on deep learning tools

GitHub Repository
