一个多语言支持、易使用的 OCR 项目。An easy-to-use OCR project with multilingual support.

Overview

AgentOCR

GitHub forks GitHub Repo stars Pypi Downloads GitHub release (latest by date including pre-releases) GitHub

Test Build

简介

  • AgentOCR 是一个基于 PaddleOCRONNXRuntime 项目开发的一个使用简单、调用方便的 OCR 项目

  • 本项目目前包含 Python Package 【AgentOCR】 和 OCR 标注软件 【AgentOCRLabeling】

使用指南

  • Python Package:

    • 快速安装:

      # 安装 AgentOCR
      $ pip install agentocr 
      
      # 根据设备平台安装合适版本的 ONNXRuntime
      $ pip install onnxruntime
    • 简单调用:

      # 导入 OCRSystem 模块
      from agentocr import OCRSystem
      
      # 初始化 OCR 模型
      ocr = OCRSystem(config='ch')
      
      # 使用模型对图像进行 OCR 识别
      results = ocr.ocr('test.jpg')
    • 服务器部署:

      • 启动 AgentOCR Server 服务

        $ agentocr server
      • Python 调用

        import cv2
        import json
        import base64
        import requests
        
        # 图片 Base64 编码
        def cv2_to_base64(image):
            data = cv2.imencode('.jpg', image)[1]
            image_base64 = base64.b64encode(data.tobytes()).decode('UTF-8')
            return image_base64
        
        
        # 读取图片
        image = cv2.imread('test.jpg')
        image_base64 = cv2_to_base64(image)
        
        # 构建请求数据
        data = {
            'image': image_base64
        }
        
        # 发送请求
        url = "http://127.0.0.1:5000/ocr"
        r = requests.post(url=url, data=json.dumps(data))
        
        # 打印预测结果
        print(r.json())
    • Jupyter Notebook:【快速使用】

    • 更多安装使用细节请参考:【Package 使用指南】

多语言支持

  • 目前预置了如下语言的配置文件,可通过语言缩写直接调用该配置文件:

    语种 描述 缩写 语种 描述 缩写
    中文 chinese and english ch 保加利亚文 Bulgarian bg
    英文 english en 乌克兰文 Ukranian uk
    法文 french fr 白俄罗斯文 Belarusian be
    德文 german german 泰卢固文 Telugu te
    日文 japan japan 阿巴扎文 Abaza abq
    韩文 korean korean 泰米尔文 Tamil ta
    中文繁体 chinese traditional cht 南非荷兰文 Afrikaans af
    意大利文 Italian it 阿塞拜疆文 Azerbaijani az
    西班牙文 Spanish es 波斯尼亚文 Bosnian bs
    葡萄牙文 Portuguese pt 捷克文 Czech cs
    俄罗斯文 Russia ru 威尔士文 Welsh cy
    阿拉伯文 Arabic ar 丹麦文 Danish da
    印地文 Hindi hi 爱沙尼亚文 Estonian et
    维吾尔 Uyghur ug 爱尔兰文 Irish ga
    波斯文 Persian fa 克罗地亚文 Croatian hr
    乌尔都文 Urdu ur 匈牙利文 Hungarian hu
    塞尔维亚文(latin) Serbian(latin) rs_latin 印尼文 Indonesian id
    欧西坦文 Occitan oc 冰岛文 Icelandic is
    马拉地文 Marathi mr 库尔德文 Kurdish ku
    尼泊尔文 Nepali ne 立陶宛文 Lithuanian lt
    塞尔维亚文(cyrillic) Serbian(cyrillic) rs_cyrillic 拉脱维亚文 Latvian lv
    毛利文 Maori mi 达尔瓦文 Dargwa dar
    马来文 Malay ms 因古什文 Ingush inh
    马耳他文 Maltese mt 拉克文 Lak lbe
    荷兰文 Dutch nl 莱兹甘文 Lezghian lez
    挪威文 Norwegian no 塔巴萨兰文 Tabassaran tab
    波兰文 Polish pl 比尔哈文 Bihari bh
    罗马尼亚文 Romanian ro 迈蒂利文 Maithili mai
    斯洛伐克文 Slovak sk 昂加文 Angika ang
    斯洛文尼亚文 Slovenian sl 孟加拉文 Bhojpuri bho
    阿尔巴尼亚文 Albanian sq 摩揭陀文 Magahi mah
    瑞典文 Swedish sv 那格浦尔文 Nagpur sck
    西瓦希里文 Swahili sw 尼瓦尔文 Newari new
    塔加洛文 Tagalog tl 保加利亚文 Goan Konkani gom
    土耳其文 Turkish tr 沙特阿拉伯文 Saudi Arabia sa
    乌兹别克文 Uzbek uz 阿瓦尔文 Avar ava
    越南文 Vietnamese vi 阿瓦尔文 Avar ava
    蒙古文 Mongolian mn 阿迪赫文 Adyghe ady

预训练模型

  • 检测模型:

    Model Name Model Type Pretrained Model
    ch_ppocr_mobile_v2.0_det det Download
    ch_ppocr_server_v2.0_det det Download
    en_ppocr_mobile_v2.0_det det Download
    en_ppocr_mobile_v2.0_table_det det Download
  • 分类模型:

    Model Name Model Type Pretrained Model
    ch_ppocr_mobile_v2.0_cls cls Download
  • 识别模型:

    Model Name Model Type Pretrained Model
    ch_ppocr_mobile_v2.0_rec rec Download
    ch_ppocr_server_v2.0_rec rec Download
    ka_ppocr_mobile_v2.0_rec rec Download
    te_ppocr_mobile_v2.0_rec rec Download
    ta_ppocr_mobile_v2.0_rec rec Download
    cht_ppocr_mobile_v2.0_rec rec Download
    japan_ppocr_mobile_v2.0_rec rec Download
    latin_ppocr_mobile_v2.0_rec rec Download
    arabic_ppocr_mobile_v2.0_rec rec Download
    korean_ppocr_mobile_v2.0_rec rec Download
    french_ppocr_mobile_v2.0_rec rec Download
    german_ppocr_mobile_v2.0_rec rec Download
    cyrillic_ppocr_mobile_v2.0_rec rec Download
    en_ppocr_mobile_v2.0_table_rec rec Download
    en_ppocr_mobile_v2.0_number_rec rec Download
    devanagari_ppocr_mobile_v2.0_rec rec Download
You might also like...
XtremeDistil framework for distilling/compressing massive multilingual neural network models to tiny and efficient models for AI at scale

XtremeDistilTransformers for Distilling Massive Multilingual Neural Networks ACL 2020 Microsoft Research [Paper] [Video] Releasing [XtremeDistilTransf

Load What You Need: Smaller Multilingual Transformers for Pytorch and TensorFlow 2.0.

Smaller Multilingual Transformers This repository shares smaller versions of multilingual transformers that keep the same representations offered by t

Code for the paper "Balancing Training for Multilingual Neural Machine Translation, ACL 2020"

Balancing Training for Multilingual Neural Machine Translation Implementation of the paper Balancing Training for Multilingual Neural Machine Translat

Code Repo for the ACL21 paper
Code Repo for the ACL21 paper "Common Sense Beyond English: Evaluating and Improving Multilingual LMs for Commonsense Reasoning"

Common Sense Beyond English: Evaluating and Improving Multilingual LMs for Commonsense Reasoning This is the Github repository of our paper, "Common S

Byte-based multilingual transformer TTS for low-resource/few-shot language adaptation.

One model to speak them all 🌎 Audio Language Text ▷ Chinese 人人生而自由,在尊严和权利上一律平等。 ▷ English All human beings are born free and equal in dignity and rig

 Contrastive Learning for Many-to-many Multilingual Neural Machine Translation(mCOLT/mRASP2), ACL2021
Contrastive Learning for Many-to-many Multilingual Neural Machine Translation(mCOLT/mRASP2), ACL2021

Contrastive Learning for Many-to-many Multilingual Neural Machine Translation(mCOLT/mRASP2), ACL2021 The code for training mCOLT/mRASP2, a multilingua

A Python multilingual toolkit for Sentiment Analysis and Social NLP tasks

pysentimiento: A Python toolkit for Sentiment Analysis and Social NLP tasks A Transformer-based library for SocialNLP classification tasks. Currently

A multilingual version of MS MARCO passage ranking dataset

mMARCO A multilingual version of MS MARCO passage ranking dataset This repository presents a neural machine translation-based method for translating t

Comments
  • linux下的预测速度是否与windows存在差异

    linux下的预测速度是否与windows存在差异

    请问大佬有没有测试过在linux运行agentocr与windows下的性能差异,感觉速度差距有点大,条件限制没办法比对完全一样的硬件环境,只是猜测是不是和系统有关?

    1. 以下环境都是基于python3.7.10,agentocr 1.2.0,预测同一张图片
    2. 本地环境是笔记本电脑,win10,CPU是AMD Ryzen 7 5800H,8核16线程,预测得到结果耗时是2.5秒以内
    3. 一台Linux服务器,Centos7,是由Intel(R) Xeon(R) CPU E5-2680 v4划出来的4核虚拟机,预测得到结果耗时是在7.2秒以内
    4. 由上面笔记本电脑运行的VirtualBOX划分了4个CPU(VB上面显示有16个CPU,猜测应该是划分了4个核心线程出来)的虚拟机,Centos7,预测得到结果耗时也和第二点的linux服务器接近
    5. 一台windows服务器,winserver2012,是由Intel i7-8700划出来的2核虚拟机,预测得到结果耗时是在4.7秒以内

    image image

    从任务运行情况来看,windows环境下在任务管理器可以看出,预测过程中所有核心都是参与工作的 而linux环境通过top命令能看出CPU占用最高只能到200%,理论上4核心应该能到400%,是不是所有核心没有参与工作导致预测速度比较慢?条件有限,笔记本的CPU和台式服务器的CPU也没有直接的性能比较可以参考,但即便是比较旧的服务器CPU也不会跟7nm的AMD笔记本CPU有这么大差距吧,如果有大佬们测试过或者知道原因希望能告知一下!!

    opened by w-Bro 10
  • 【PaddlePaddle Hackathon】100 制作 Rubick 深度学习相关小插件

    【PaddlePaddle Hackathon】100 制作 Rubick 深度学习相关小插件

    (此 ISSUE 为 PaddlePaddle Hackathon 活动的任务 ISSUE,更多详见PaddlePaddle Hackathon

    【任务说明】

    • 任务标题:制作 Rubick 深度学习相关小插件

    • 难度:中等(通过验收即可获得5000RMB)

    • 技术标签:JavaScript、PaddlePaddle

    • 详细描述:随着 Rubick、Utools 等高质量桌面效能工具箱的出现,使用深度学习进行赋能将会为其带来更多有趣的玩法。在本任务中,您可以借助 AgentOCR 或其他飞桨相关深度学习工具,结合 Paddle.JS 或 ONNX.JS 将深度学习模型以 Rubick 插件形式进行部署,例如使用 AgentOCR 的 OCR 能力让 Rubick 的截图拥有文字识别能力,当然你也可以选择自己喜欢的模型为 Rubick 进行赋能,只要以 Rubick 的插件形式进行开发即可视为有效提交。

    Paddle.JS 主页:https://github.com/PaddlePaddle/Paddle.js

    AgentOCR 主页:https://github.com/AgentMaker/AgentOCR

    【提交内容】

    • 项目 PR 到 AgentOCR

    • 技术说明文档

    【技术要求】

    • 具备的 JavaScript 开发能力
    PaddlePaddle Hackathon 
    opened by GT-ZhangAcer 0
  • 【PaddlePaddle Hackathon】99 为 AgentOCR 工具适配 JavaScript 环境

    【PaddlePaddle Hackathon】99 为 AgentOCR 工具适配 JavaScript 环境

    (此 ISSUE 为 PaddlePaddle Hackathon 活动的任务 ISSUE,更多详见PaddlePaddle Hackathon

    【任务说明】

    • 任务标题:为 AgentOCR 工具适配 JavaScript 环境

    • 技术标签:JavaScript

    • 任务难度:简单

    • 详细描述:在 Web 前端以及、移动端 APP 开发甚至是桌面应用开发中, JavaScript 所体现的强大兼容性使得跨平台应用更加便捷。目前 AgentOCR 提供了飞桨 PaddlePaddle、ONNX、DML 三种后端支持,为更方便让基于 PaddleOCR 的 AgentOCR 更好适配更多开发者所需环境,我们可以通过不限于 Paddle.JS、ONNX.JS 中任一方式使得其支持JavaScript的OCR推理功能。本这个项目中,你需要在精度损失和速度损失较低的情况下制作 Paddle.JS 或 ONNX.JS 版本的 AgentOCR 开发程序包。

    Paddle.JS 主页:https://github.com/PaddlePaddle/Paddle.js

    AgentOCR 主页:https://github.com/AgentMaker/AgentOCR

    【提交内容】

    • 项目 PR 到 AgentOCR

    • 技术说明文档

    【技术要求】

    • 具备的 JavaScript 开发能力
    PaddlePaddle Hackathon 
    opened by GT-ZhangAcer 0
Releases(2.0.0)
Owner
AgentMaker
Focus on deep learning tools
AgentMaker
Style-based Point Generator with Adversarial Rendering for Point Cloud Completion (CVPR 2021)

Style-based Point Generator with Adversarial Rendering for Point Cloud Completion (CVPR 2021) An efficient PyTorch library for Point Cloud Completion.

Microsoft 119 Jan 02, 2023
A Pytorch implement of paper "Anomaly detection in dynamic graphs via transformer" (TADDY).

TADDY: Anomaly detection in dynamic graphs via transformer This repo covers an reference implementation for the paper "Anomaly detection in dynamic gr

Yue Tan 21 Nov 24, 2022
Pytorch Implementations of large number classical backbone CNNs, data enhancement, torch loss, attention, visualization and some common algorithms.

Torch-template-for-deep-learning Pytorch implementations of some **classical backbone CNNs, data enhancement, torch loss, attention, visualization and

Li Shengyan 270 Dec 31, 2022
Bling's Object detection tool

BriVL for Building Applications This repo is used for illustrating how to build applications by using BriVL model. This repo is re-implemented from fo

chuhaojin 47 Nov 01, 2022
[CVPR 2020] Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation

Contents Local and Global GAN Cross-View Image Translation Semantic Image Synthesis Acknowledgments Related Projects Citation Contributions Collaborat

Hao Tang 131 Dec 07, 2022
Group R-CNN for Point-based Weakly Semi-supervised Object Detection (CVPR2022)

Group R-CNN for Point-based Weakly Semi-supervised Object Detection (CVPR2022) By Shilong Zhang*, Zhuoran Yu*, Liyang Liu*, Xinjiang Wang, Aojun Zhou,

Shilong Zhang 129 Dec 24, 2022
Pytorch Implementation of "Desigining Network Design Spaces", Radosavovic et al. CVPR 2020.

RegNet Pytorch Implementation of "Desigining Network Design Spaces", Radosavovic et al. CVPR 2020. Paper | Official Implementation RegNet offer a very

Vishal R 2 Feb 11, 2022
Jittor implementation of Recursive-NeRF: An Efficient and Dynamically Growing NeRF

Recursive-NeRF: An Efficient and Dynamically Growing NeRF This is a Jittor implementation of Recursive-NeRF: An Efficient and Dynamically Growing NeRF

33 Nov 30, 2022
PyTorch EO aims to make Deep Learning for Earth Observation data easy and accessible to real-world cases and research alike.

Pytorch EO Deep Learning for Earth Observation applications and research. 🚧 This project is in early development, so bugs and breaking changes are ex

earthpulse 28 Aug 25, 2022
Discriminative Condition-Aware PLDA

DCA-PLDA This repository implements the Discriminative Condition-Aware Backend described in the paper: L. Ferrer, M. McLaren, and N. Brümmer, "A Speak

Luciana Ferrer 31 Aug 05, 2022
Improving Factual Consistency of Abstractive Text Summarization

Improving Factual Consistency of Abstractive Text Summarization We provide the code for the papers: "Entity-level Factual Consistency of Abstractive T

61 Nov 27, 2022
RP-GAN: Stable GAN Training with Random Projections

RP-GAN: Stable GAN Training with Random Projections This repository contains a reference implementation of the algorithm described in the paper: Behna

Ayan Chakrabarti 20 Sep 18, 2021
BOVText: A Large-Scale, Multidimensional Multilingual Dataset for Video Text Spotting

BOVText: A Large-Scale, Bilingual Open World Dataset for Video Text Spotting Updated on December 10, 2021 (Release all dataset(2021 videos)) Updated o

weijiawu 47 Dec 26, 2022
Learning View Priors for Single-view 3D Reconstruction (CVPR 2019)

Learning View Priors for Single-view 3D Reconstruction (CVPR 2019) This is code for a paper Learning View Priors for Single-view 3D Reconstruction by

Hiroharu Kato 38 Aug 17, 2022
Advanced yabai wooting scripts

Yabai Wooting scripts Installation requirements Both https://github.com/xiamaz/python-yabai-client and https://github.com/xiamaz/python-wooting-rgb ne

Max Zhao 3 Dec 31, 2021
This is a code repository for paper OODformer: Out-Of-Distribution Detection Transformer

OODformer: Out-Of-Distribution Detection Transformer This repo is the official the implementation of the OODformer: Out-Of-Distribution Detection Tran

34 Dec 02, 2022
TDN: Temporal Difference Networks for Efficient Action Recognition

TDN: Temporal Difference Networks for Efficient Action Recognition Overview We release the PyTorch code of the TDN(Temporal Difference Networks).

Multimedia Computing Group, Nanjing University 326 Dec 13, 2022
OHLC Average Prediction of Apple Inc. Using LSTM Recurrent Neural Network

Stock Price Prediction of Apple Inc. Using Recurrent Neural Network OHLC Average Prediction of Apple Inc. Using LSTM Recurrent Neural Network Dataset:

Nouroz Rahman 410 Jan 05, 2023
Code for the AAAI 2022 paper "Zero-Shot Cross-Lingual Machine Reading Comprehension via Inter-Sentence Dependency Graph".

multilingual-mrc-isdg Code for the AAAI 2022 paper "Zero-Shot Cross-Lingual Machine Reading Comprehension via Inter-Sentence Dependency Graph". This r

Liyan 5 Dec 07, 2022
Project repo for Learning Category-Specific Mesh Reconstruction from Image Collections

Learning Category-Specific Mesh Reconstruction from Image Collections Angjoo Kanazawa*, Shubham Tulsiani*, Alexei A. Efros, Jitendra Malik University

438 Dec 22, 2022