docstrum

Last update: Dec 13, 2022

Overview

Docstrum Algorithm

Getting Started

This repository implements the Docstrum algorithm presented by O'Gorman (1993).

Disclaimer

This source code builds on the work of Chadoliver. The original code is available at https://github.com/chadoliver/cosc428-structor.

Objective

This project aims to segment a document image into meaningful components. The target domain is historical machine-printed and handwritten document images.

Dependencies

Python 2.7
Packages:
- numpy
- cv2 (OpenCV)

Process

Pre-processing (optional for vertical-line removal)
- Blurring (bilateral filtering)
- Otsu's thresholding
- Morphological erosion & dilation
- Smoothing (Averaging)
- Static thresholding
Nearest-Neighbor Clustering and Docstrum Plot
Spacing and Orientation Estimation
Determination of Text-lines
Structural Block Determination
Post-processing
- TBD
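
The thresholding step in the pre-processing list can be illustrated with a small sketch. With OpenCV this is usually a single call to cv2.threshold with the cv2.THRESH_OTSU flag; the README does not show how this repository calls it, so the code below is only a plain-Python illustration of what Otsu's method computes. The function names otsu_threshold and binarize are hypothetical, and the sketch assumes dark ink on a light page with 8-bit grey values.

```python
def otsu_threshold(pixels):
    """Return the Otsu threshold t for a flat sequence of 8-bit grey values.

    Values <= t form one class (ink, assuming dark text on a light page),
    values > t form the other class (background).
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(v * n for v, n in enumerate(hist))

    best_t, best_score = 0, -1.0
    count0 = 0  # number of pixels with value <= t
    sum0 = 0    # sum of those pixel values
    for t in range(256):
        count0 += hist[t]
        sum0 += t * hist[t]
        count1 = total - count0
        if count0 == 0:
            continue  # no pixels in the dark class yet
        if count1 == 0:
            break     # every pixel is already in the dark class
        mean0 = sum0 / float(count0)
        mean1 = (sum_all - sum0) / float(count1)
        # Between-class variance, up to a constant factor of 1 / total**2.
        score = count0 * count1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def binarize(pixels):
    """Map each grey value to 1 (ink, value <= threshold) or 0 (background)."""
    t = otsu_threshold(pixels)
    return [1 if p <= t else 0 for p in pixels]
```

For a real page the pixel sequence would be the flattened greyscale image after blurring. The list-based version here is meant for reading, not for speed.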

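The nearest-neighbor clustering and the spacing and orientation estimation steps can be sketched as follows. This is a minimal illustration of the idea in O'Gorman (1993), not this repository's code: the function names, the default k of 5, the bin widths, and the angle tolerance are all assumptions chosen for the example. It uses only the standard library so that it runs without numpy or cv2. The input is a list of connected-component centroids as (x, y) tuples. In image coordinates y grows downward, so the sign of the angle is flipped relative to the usual mathematical convention.

```python
import math


def nearest_neighbor_pairs(centroids, k=5):
    """Return (distance, angle_deg) for each centroid and its k nearest neighbors.

    Angles are folded into [-90, 90), since a neighbor to the left and a
    neighbor to the right describe the same text-line direction.
    """
    pairs = []
    for i, (x0, y0) in enumerate(centroids):
        candidates = []
        for j, (x1, y1) in enumerate(centroids):
            if i != j:
                dx, dy = x1 - x0, y1 - y0
                candidates.append((math.hypot(dx, dy), dx, dy))
        candidates.sort()
        for dist, dx, dy in candidates[:k]:
            angle = math.degrees(math.atan2(dy, dx))
            if angle >= 90.0:
                angle -= 180.0
            elif angle < -90.0:
                angle += 180.0
            pairs.append((dist, angle))
    return pairs


def histogram_peak(values, bin_width, lowest):
    """Return the center of the fullest fixed-width bin (lowest bin wins ties)."""
    counts = {}
    for v in values:
        b = int(math.floor((v - lowest) / float(bin_width)))
        counts[b] = counts.get(b, 0) + 1
    best = max(sorted(counts), key=lambda b: counts[b])
    return lowest + (best + 0.5) * bin_width


def angular_difference(a, b):
    """Smallest difference between two directions, treating them as 180-periodic."""
    diff = abs(a - b) % 180.0
    return min(diff, 180.0 - diff)


def estimate_orientation_and_spacing(centroids, k=5, angle_bin=2.0,
                                     dist_bin=1.0, tolerance=15.0):
    """Estimate text-line orientation (degrees) and within-line spacing."""
    pairs = nearest_neighbor_pairs(centroids, k)
    # Shift the bins by half a width so that 0 degrees is a bin center.
    orientation = histogram_peak([a for _, a in pairs], angle_bin,
                                 -90.0 - angle_bin / 2.0)
    along_line = [d for d, a in pairs
                  if angular_difference(a, orientation) <= tolerance]
    spacing = histogram_peak(along_line, dist_bin, 0.0)
    return orientation, spacing
```

The list returned by nearest_neighbor_pairs is the data behind the docstrum plot: plotting each pair in polar coordinates gives the characteristic clusters along and across the text-line direction. Between-line spacing could be estimated the same way from the pairs whose angle is close to perpendicular to the estimated orientation.
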
Evaluation

Citing Docstrum

O'Gorman, L., 1993. The document spectrum for page layout analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(11), pp. 1162-1173.

@article{o1993document,
  title={The document spectrum for page layout analysis},
  author={O'Gorman, Lawrence},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume={15},
  number={11},
  pages={1162--1173},
  year={1993},
  publisher={IEEE}
}

Notes

How to remove .DS_Store files

find . -name '.DS_Store' -type f -delete

Owner

Chulwoo Mike Pack
