An Optical Character Recognition (OCR) system built with pytesseract that extracts data from blood pressure reports.

Overview

Optical_Character_Recognition

As an IoT/Computer Vision intern in the Graduate Rotational Internship Program (GRIP) at The Sparks Foundation (TSF), my first task was to implement a character detector that extracts printed or handwritten text from an image or video.

To learn more, I applied the detector to images of blood pressure reports, cleaning the OCR output and extracting the valuable information from it.
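As a rough illustration of the extraction step, the sketch below pulls systolic/diastolic pairs out of text that has already been produced by OCR. The report layout is not described here, so the "120/80" pattern, the sanity bounds, and the names `BP_PATTERN` and `extract_bp_readings` are assumptions made for the example, not the project's actual code.

```python
import re

# Assumed format: readings appear as "systolic/diastolic", e.g. "120/80" or "128 / 84".
BP_PATTERN = re.compile(r"\b(\d{2,3})\s*/\s*(\d{2,3})\b")


def extract_bp_readings(text):
    """Return a list of (systolic, diastolic) integer pairs found in OCR text."""
    readings = []
    for systolic, diastolic in BP_PATTERN.findall(text):
        s, d = int(systolic), int(diastolic)
        # Loose, assumed sanity bounds to drop look-alikes such as dates ("12/05").
        if 60 <= s <= 260 and 30 <= d <= 160 and s > d:
            readings.append((s, d))
    return readings


sample = "Date 12/05 Morning: 128 / 84 mmHg  Evening: 135/90 mmHg"
print(extract_bp_readings(sample))  # [(128, 84), (135, 90)]
```

Real reports may label the values differently (for example "SYS" and "DIA" columns), in which case the pattern would need to be adapted.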

Dependencies

  • tesseract-ocr package
  • pytesseract 0.3.8: open-source library used to detect text in an image or video
  • OpenCV: image processing
  • pandas: data manipulation
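The three libraries can fit together roughly as follows. This is a minimal sketch, not the project's actual pipeline: Otsu thresholding as the preprocessing step, the confidence cutoff of 60, and the function names `preprocess`, `confident_words`, and `ocr_to_frame` are all assumptions for illustration. The heavy imports are placed inside the functions so that the pure-Python filtering helper can be used on its own.

```python
def preprocess(image_path):
    """Load an image and binarize it with Otsu's threshold (an assumed choice)."""
    import cv2  # imported here so confident_words() works without OpenCV

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return binary


def confident_words(ocr_data, min_conf=60.0):
    """Keep non-empty words whose confidence is at least min_conf."""
    rows = []
    for text, conf in zip(ocr_data["text"], ocr_data["conf"]):
        # Tesseract reports -1 for non-word boxes; conf may be str, int, or float.
        if text.strip() and float(conf) >= min_conf:
            rows.append({"text": text.strip(), "conf": float(conf)})
    return rows


def ocr_to_frame(image_path, min_conf=60.0):
    """Run Tesseract on the preprocessed image and return the words as a DataFrame."""
    import pandas as pd
    import pytesseract
    from pytesseract import Output

    data = pytesseract.image_to_data(preprocess(image_path), output_type=Output.DICT)
    return pd.DataFrame(confident_words(data, min_conf), columns=["text", "conf"])
```

With the tesseract-ocr package installed, calling `ocr_to_frame("report.png")` (a hypothetical file name) would return one row per recognized word along with its confidence score.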

  • Owner
    Ramsis Hammadi