Packaged, Pytorch-based, easy to use, cross-platform version of the CRAFT text detector

CRAFT: Character-Region Awareness For Text detection

Overview

PyTorch implementation of the CRAFT text detector, which detects text areas by exploring each character region and the affinity between characters. The bounding boxes of the text are obtained by finding minimum bounding rectangles on the binary map produced by thresholding the character region and affinity scores.
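The post-processing described above can be illustrated with a small, dependency-free sketch. It is not the package's own code: the function name `detect_boxes` is hypothetical, the way the three thresholds interact is an assumption based on the parameter names used in the advanced usage example, and it returns axis-aligned rectangles rather than true minimum-area (possibly rotated) rectangles.

```python
from collections import deque


def detect_boxes(text_map, link_map, text_threshold=0.7, link_threshold=0.4, low_text=0.4):
    """Hypothetical sketch: return (x_min, y_min, x_max, y_max) boxes for text regions.

    text_map and link_map are 2D lists of scores in [0, 1] with the same shape.
    """
    height, width = len(text_map), len(text_map[0])
    # Binary map: a pixel is foreground if either score passes its threshold.
    binary = [
        [text_map[y][x] > low_text or link_map[y][x] > link_threshold for x in range(width)]
        for y in range(height)
    ]
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not binary[y][x] or seen[y][x]:
                continue
            # Flood-fill one 4-connected component, tracking its extent and peak score.
            queue = deque([(x, y)])
            seen[y][x] = True
            x_min = x_max = x
            y_min = y_max = y
            peak = text_map[y][x]
            while queue:
                cx, cy = queue.popleft()
                x_min, x_max = min(x_min, cx), max(x_max, cx)
                y_min, y_max = min(y_min, cy), max(y_max, cy)
                peak = max(peak, text_map[cy][cx])
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < width and 0 <= ny < height and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            # Keep the component only if it contains a confident character peak.
            if peak >= text_threshold:
                boxes.append((x_min, y_min, x_max, y_max))
    return boxes
```

In this sketch the affinity (link) score is what joins neighbouring characters into one component, so a word yields a single box instead of one box per character.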

Getting started

Installation

  • Install using conda for Linux, Mac and Windows (preferred):
conda install -c fcakyon craft-text-detector
  • Install using pip for Linux and Mac:
pip install craft-text-detector

Basic Usage

# import Craft class
from craft_text_detector import Craft

# set image path and export folder directory
image_path = 'figures/idcard.png'
output_dir = 'outputs/'

# create a craft instance
craft = Craft(output_dir=output_dir, crop_type="poly", cuda=False)

# apply craft text detection and export detected regions to output directory
prediction_result = craft.detect_text(image_path)

# unload models from ram/gpu
craft.unload_craftnet_model()
craft.unload_refinenet_model()
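If detection raises an exception, the two unload calls above are never reached. One possible way to guard against that is a small context manager, sketched below. The helper name `unloading` is hypothetical and not part of craft-text-detector; it only relies on the two unload methods shown in the basic usage example.

```python
from contextlib import contextmanager


@contextmanager
def unloading(detector):
    """Hypothetical helper: yield the detector and always unload its models afterwards."""
    try:
        yield detector
    finally:
        # Same two calls as in the basic usage example; they run even on errors.
        detector.unload_craftnet_model()
        detector.unload_refinenet_model()


# Intended usage (assumes the Craft class from the basic usage example):
#
#     with unloading(Craft(output_dir=output_dir, crop_type="poly", cuda=False)) as craft:
#         prediction_result = craft.detect_text(image_path)
```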

Advanced Usage

# import craft functions
from craft_text_detector import (
    read_image,
    load_craftnet_model,
    load_refinenet_model,
    get_prediction,
    export_detected_regions,
    export_extra_results,
    empty_cuda_cache
)

# set image path and export folder directory
image_path = 'figures/idcard.png'
output_dir = 'outputs/'

# read image
image = read_image(image_path)

# load models
refine_net = load_refinenet_model(cuda=True)
craft_net = load_craftnet_model(cuda=True)

# perform prediction
prediction_result = get_prediction(
    image=image,
    craft_net=craft_net,
    refine_net=refine_net,
    text_threshold=0.7,
    link_threshold=0.4,
    low_text=0.4,
    cuda=True,
    long_size=1280
)

# export detected text regions
exported_file_paths = export_detected_regions(
    image_path=image_path,
    image=image,
    regions=prediction_result["boxes"],
    output_dir=output_dir,
    rectify=True
)

# export heatmap, detection points, box visualization
export_extra_results(
    image_path=image_path,
    image=image,
    regions=prediction_result["boxes"],
    heatmaps=prediction_result["heatmaps"],
    output_dir=output_dir
)

# unload models from gpu
empty_cuda_cache()
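
Downstream code often wants plain axis-aligned rectangles rather than polygons. The sketch below assumes that each entry of prediction_result["boxes"] is a sequence of (x, y) points; that layout is an assumption and is not stated above. The helper name `polygon_to_bbox` is hypothetical.

```python
import math


def polygon_to_bbox(points):
    """Hypothetical helper: return (x_min, y_min, x_max, y_max) enclosing all points."""
    xs = [float(x) for x, _ in points]
    ys = [float(y) for _, y in points]
    # Round outwards so the integer box always contains the whole polygon.
    return (math.floor(min(xs)), math.floor(min(ys)), math.ceil(max(xs)), math.ceil(max(ys)))


# Intended usage (assumes the prediction_result from the example above):
#
#     bboxes = [polygon_to_bbox(region) for region in prediction_result["boxes"]]
```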

Comments
  • Add more options for detect_text method

    Hi, sometimes I don't want to run detect_text on a file; I want to run it directly on an image in ndarray format, which saves I/O time. So I contributed this. Thanks for your work.

    opened by ducviet00 2
  • Enable package to load model from local path

    When using the PyPI package, it should be possible to load a model from a local path, because loading it from a remote location removes control over which model is currently used and might also result in pull limits being reached.

    enhancement 
    opened by TanjaBayer 1
  • Fix #8 - Fixing cuda issues in basic usage text detection

    Fixing issue #8

    In this quick fix I referenced craft_net as a global variable. If this is not an acceptable workaround, consider reorganizing the structure of the code.

    Have a nice day :)

    opened by gaborpelesz 1
  • Accept customized weights path when loading models

    The path for the weight file can be specified by:

    load_craftnet_model(weight_path="path/to/weight")
    
    load_refinenet_model(weight_path="path/to/weight")
    
    opened by fcakyon 0
Releases(0.4.3)
  • 0.4.3(May 9, 2022)

    What's Changed

    • Enable package to load model from local path by @TanjaBayer in https://github.com/fcakyon/craft-text-detector/pull/53

    New Contributors

    • @TanjaBayer made their first contribution in https://github.com/fcakyon/craft-text-detector/pull/53

    Full Changelog: https://github.com/fcakyon/craft-text-detector/compare/0.4.2...0.4.3

  • 0.4.2(Jan 6, 2022)

    What's Changed

    • fix opencv version by @fcakyon in https://github.com/fcakyon/craft-text-detector/pull/48

    Full Changelog: https://github.com/fcakyon/craft-text-detector/compare/0.4.1...0.4.2

  • 0.4.1(Dec 20, 2021)

    What's Changed

    • fix crop export by @fcakyon in https://github.com/fcakyon/craft-text-detector/pull/45

    Full Changelog: https://github.com/fcakyon/craft-text-detector/compare/0.4.0...0.4.1

  • 0.4.0(Jul 30, 2021)

  • 0.3.5(May 12, 2021)

  • 0.3.4(Apr 7, 2021)

    • add support for PIL and numpy images in addition to filepath. https://github.com/fcakyon/craft-text-detector/pull/28
    from PIL import Image
    import numpy
    
    # can be filepath, PIL image or numpy array
    image = 'figures/idcard.png' 
    image = Image.open("figures/idcard.png")
    image = numpy.array(Image.open("figures/idcard.png"))
    
    # apply craft text detection
    prediction_result = craft.detect_text(image)
  • 0.3.3(Mar 2, 2021)

  • 0.3.2(Mar 2, 2021)

    • the path for the weight file can be specified by:

    load_craftnet_model(weight_path="path/to/weight")
    
    load_refinenet_model(weight_path="path/to/weight")
    
  • v0.3.0(May 14, 2020)

    • updated basic usage for better device handling; a Craft instance should now be created before calling detect_text:
    # import Craft class
    from craft_text_detector import Craft
    
    # set image path and export folder directory
    image_path = 'figures/idcard.png'
    output_dir = 'outputs/'
    
    # create a craft instance
    craft = Craft(output_dir=output_dir, crop_type="poly", cuda=False)
    
    # apply craft text detection and export detected regions to output directory
    prediction_result = craft.detect_text(image_path)
    
    # unload models from ram/gpu
    craft.unload_craftnet_model()
    craft.unload_refinenet_model()
    
    • some internal naming and styling changes
  • v0.2.1(May 10, 2020)

  • v0.2.0a(Apr 22, 2020)

  • v0.2.0(Apr 22, 2020)

Owner
Senior Machine Learning Engineer, METU & Bilkent alum.