Packaged, PyTorch-based, easy-to-use, cross-platform version of the CRAFT text detector

Last update: Dec 28, 2022

Overview

CRAFT: Character-Region Awareness For Text detection


A PyTorch implementation of the CRAFT text detector, which detects text areas by exploring each character region and the affinity between characters. Text bounding boxes are obtained by finding minimum bounding rectangles on the binary map produced by thresholding the character region and affinity scores.
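
The post-processing described above can be illustrated with a minimal pure-Python sketch. It is not the package's implementation: the function name `detect_boxes`, its two threshold parameters, and the use of axis-aligned boxes in place of minimum-area rectangles are simplifying assumptions made for illustration only.

```python
from collections import deque


def detect_boxes(region, affinity, region_threshold=0.7, affinity_threshold=0.4):
    """Return axis-aligned boxes (x_min, y_min, x_max, y_max), one per connected text area.

    `region` and `affinity` are equally sized 2D lists of scores in [0, 1].
    """
    height, width = len(region), len(region[0])
    # Binary map: a pixel counts as text if either score clears its threshold.
    binary = [
        [region[y][x] >= region_threshold or affinity[y][x] >= affinity_threshold
         for x in range(width)]
        for y in range(height)
    ]
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not binary[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill over 4-connected neighbours.
            queue = deque([(y, x)])
            seen[y][x] = True
            x_min = x_max = x
            y_min = y_max = y
            while queue:
                cy, cx = queue.popleft()
                x_min, x_max = min(x_min, cx), max(x_max, cx)
                y_min, y_max = min(y_min, cy), max(y_max, cy)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((x_min, y_min, x_max, y_max))
    return boxes


# Two characters in row 1 joined by an affinity pixel, plus one isolated character.
region = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.0, 0.9, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.8],
]
affinity = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.6, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
print(detect_boxes(region, affinity))  # [(1, 1, 3, 1), (5, 3, 5, 3)]
```

The affinity pixel is what merges the two neighbouring characters into a single box; without it, each character would produce its own box.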

Getting started

Installation

Install using conda for Linux, Mac and Windows (preferred):

conda install -c fcakyon craft-text-detector

Install using pip for Linux and Mac:

pip install craft-text-detector

Basic Usage

# import Craft class
from craft_text_detector import Craft

# set image path and export folder directory
image_path = 'figures/idcard.png'
output_dir = 'outputs/'

# create a craft instance
craft = Craft(output_dir=output_dir, crop_type="poly", cuda=False)

# apply craft text detection and export detected regions to output directory
prediction_result = craft.detect_text(image_path)

# unload models from ram/gpu
craft.unload_craftnet_model()
craft.unload_refinenet_model()
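
If detection raises an exception, the two unload calls above are never reached. One possible way to guarantee cleanup is a small context manager; `unloading` is a hypothetical helper, not part of the package, and it only assumes the two unload methods shown above.

```python
from contextlib import contextmanager


@contextmanager
def unloading(craft):
    """Yield the detector, then unload both models even if detection raises."""
    try:
        yield craft
    finally:
        craft.unload_craftnet_model()
        craft.unload_refinenet_model()


# Possible usage with the Craft instance from the example above:
# with unloading(Craft(output_dir='outputs/', crop_type="poly", cuda=False)) as craft:
#     prediction_result = craft.detect_text('figures/idcard.png')
```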

Advanced Usage

# import craft functions
from craft_text_detector import (
    read_image,
    load_craftnet_model,
    load_refinenet_model,
    get_prediction,
    export_detected_regions,
    export_extra_results,
    empty_cuda_cache
)

# set image path and export folder directory
image_path = 'figures/idcard.png'
output_dir = 'outputs/'

# read image
image = read_image(image_path)

# load models
refine_net = load_refinenet_model(cuda=True)
craft_net = load_craftnet_model(cuda=True)

# perform prediction
prediction_result = get_prediction(
    image=image,
    craft_net=craft_net,
    refine_net=refine_net,
    text_threshold=0.7,
    link_threshold=0.4,
    low_text=0.4,
    cuda=True,
    long_size=1280
)

# export detected text regions
exported_file_paths = export_detected_regions(
    image_path=image_path,
    image=image,
    regions=prediction_result["boxes"],
    output_dir=output_dir,
    rectify=True
)

# export heatmap, detection points, box visualization
export_extra_results(
    image_path=image_path,
    image=image,
    regions=prediction_result["boxes"],
    heatmaps=prediction_result["heatmaps"],
    output_dir=output_dir
)

# unload models from gpu
empty_cuda_cache()
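
The `long_size` argument passed to `get_prediction` above controls the input size. The sketch below shows one plausible reading of it, assuming `long_size` is the target length of the image's longer side with the aspect ratio preserved; `scaled_size` is a hypothetical helper, and the package may round or pad differently.

```python
def scaled_size(height, width, long_size=1280):
    """Return (new_height, new_width, ratio) with the longer side scaled to long_size."""
    ratio = long_size / max(height, width)
    return round(height * ratio), round(width * ratio), ratio


# A 720x2560 image is halved so that its longer side becomes 1280.
print(scaled_size(720, 2560))  # (360, 1280, 0.5)
```

Under this assumption, coordinates predicted on the resized image are divided by `ratio` to map them back onto the original image.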


Comments

Add more options for detect_text method

Hi, sometimes I don't want to run detect_text on a file; I want to run it directly on an image in ndarray format, which saves I/O time. So I contributed this. Thanks for your work.

opened by ducviet00 2
Enable package to load model from local path

When using the PyPI package, it should be possible to load a model from a local path, because loading it from a remote location removes control over which model is currently used and might also result in pull limits being reached.
enhancement

opened by TanjaBayer 1
Fix #8 - Fixing cuda issues in basic usage text detection

Fixing issue #8

In this quick-fix I referenced craft_net as a global variable. If this is not an acceptable workaround, then consider reorganizing the structure of the code.

Have a nice day :)

opened by gaborpelesz 1
accept customized weights path when loading models
path for the weight file can be specified by:

load_craftnet_model(weight_path="path/to/weight")

load_refinenet_model(weight_path="path/to/weight")
opened by fcakyon 0

Releases(0.4.3)

0.4.3(May 9, 2022)
What's Changed

Enable package to load model from local path by @TanjaBayer in https://github.com/fcakyon/craft-text-detector/pull/53

New Contributors

@TanjaBayer made their first contribution in https://github.com/fcakyon/craft-text-detector/pull/53

Full Changelog: https://github.com/fcakyon/craft-text-detector/compare/0.4.2...0.4.3
0.4.2(Jan 6, 2022)
What's Changed

fix opencv version by @fcakyon in https://github.com/fcakyon/craft-text-detector/pull/48

Full Changelog: https://github.com/fcakyon/craft-text-detector/compare/0.4.1...0.4.2
0.4.1(Dec 20, 2021)
What's Changed

fix crop export by @fcakyon in https://github.com/fcakyon/craft-text-detector/pull/45

Full Changelog: https://github.com/fcakyon/craft-text-detector/compare/0.4.0...0.4.1
0.4.0(Jul 30, 2021)
enhancement

fix boxes outside image boundaries (#37)

breaking changes

drop conda support, update python version (#38)

0.3.5(May 12, 2021)
Rebuild conda binaries.


0.3.4(Apr 7, 2021)

add support for PIL and numpy images in addition to filepath. https://github.com/fcakyon/craft-text-detector/pull/28

from PIL import Image
import numpy

# can be filepath, PIL image or numpy array
image = 'figures/idcard.png' 
image = Image.open("figures/idcard.png")
image = numpy.array(Image.open("figures/idcard.png"))

# apply craft text detection
prediction_result = craft.detect_text(image)


0.3.3(Mar 2, 2021)
Relax requirements for OpenCV (#25)


0.3.2(Mar 2, 2021)

path for the weight file can be specified by:

load_craftnet_model(weight_path="path/to/weight")

load_refinenet_model(weight_path="path/to/weight")
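
When passing a local weight file, it can help to validate the path before handing it to the loader, so that a typo fails with a clear error. The sketch below uses only the standard library; `checked_weight_path` is a hypothetical helper, not part of the package.

```python
from pathlib import Path


def checked_weight_path(path):
    """Return an absolute path string for an existing weight file, or raise."""
    weight_file = Path(path).expanduser().resolve()
    if not weight_file.is_file():
        raise FileNotFoundError(f"weight file not found: {weight_file}")
    return str(weight_file)


# Possible usage with the loaders shown above:
# craft_net = load_craftnet_model(weight_path=checked_weight_path("path/to/weight"))
# refine_net = load_refinenet_model(weight_path=checked_weight_path("path/to/weight"))
```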


v0.3.1(May 14, 2020)
fix empty_cuda_cache


v0.3.0(May 14, 2020)

Updated basic usage for better device handling. A Craft instance should now be created before calling detect_text:

# import Craft class
from craft_text_detector import Craft

# set image path and export folder directory
image_path = 'figures/idcard.png'
output_dir = 'outputs/'

# create a craft instance
craft = Craft(output_dir=output_dir, crop_type="poly", cuda=False)

# apply craft text detection and export detected regions to output directory
prediction_result = craft.detect_text(image_path)

# unload models from ram/gpu
craft.unload_craftnet_model()
craft.unload_refinenet_model()

some internal naming and styling changes


v0.2.1(May 10, 2020)
fix cuda device bug

fix visualization export bug

v0.2.0a(Apr 22, 2020)

v0.2.0(Apr 22, 2020)
time profiling

better input size handling (with new long_size parameter)

bug fixes


Owner

Senior Machine Learning Engineer, METU & Bilkent alum.


