Binarize document images

Binarization for document images

Introduction

This tool performs document image binarization (i.e., it transforms colour or grayscale pixels into black-and-white pixels) for OCR, using multiple trained models.

The method is based on Calvo-Zaragoza and Gallego (2018), "A selectional auto-encoder approach for document image binarization".

Installation

Clone the repository, enter it, and run:

pip install .

Models

Pre-trained models can be downloaded from here:

https://qurator-data.de/sbb_binarization/

Usage

sbb_binarize \
  --patches \
  -m <directory with models> \
  <input image> \
  <output image>

Note: In virtually all cases, the --patches flag will improve results.

To use the OCR-D interface:

ocrd-sbb-binarize --overwrite -I INPUT_FILE_GRP -O OCR-D-IMG-BIN -P model "/var/lib/sbb_binarization"
Comments
  • Handle input errors in exceptions

    Hello, I am trying to binarize an input image with "sbb_binarize --patches -m ./models/model_bin_sbb_ens.h5 179681.png img.png".

    I get the following error:

      File "/home/lin/anaconda3/bin/sbb_binarize", line 8, in <module>
        sys.exit(main())
      File "/home/lin/anaconda3/lib/python3.5/site-packages/click/core.py", line 829, in __call__
        return self.main(*args, **kwargs)
      File "/home/lin/anaconda3/lib/python3.5/site-packages/click/core.py", line 782, in main
        rv = self.invoke(ctx)
      File "/home/lin/anaconda3/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/home/lin/anaconda3/lib/python3.5/site-packages/click/core.py", line 610, in invoke
        return callback(*args, **kwargs)
      File "/home/lin/anaconda3/lib/python3.5/site-packages/sbb_binarize/cli.py", line 16, in main
        SbbBinarizer(model_dir).run(image_path=input_image, use_patches=patches, save=output_image)
      File "/home/lin/anaconda3/lib/python3.5/site-packages/sbb_binarize/sbb_binarize.py", line 265, in run
        img_last[:, :][img_last[:, :] > 0] = 255
    TypeError: 'int' object is not subscriptable

    documentation 
    opened by hiyashi-CianDuo 21
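
One way a wrapper script could surface such input problems early is to check the arguments before any model is loaded. The sketch below is only an illustration: the helper name check_binarize_inputs is hypothetical and not part of sbb_binarization. It assumes, following the Usage section, that -m expects a directory of models. The command in the report passes a single .h5 file to -m; whether that is what triggers the TypeError is not confirmed here.

```python
from pathlib import Path


def check_binarize_inputs(model_dir: str, input_image: str) -> None:
    """Raise a descriptive error if the arguments look wrong (hypothetical helper)."""
    model_path = Path(model_dir)
    # Assumption taken from the Usage section: -m points at a directory.
    if not model_path.is_dir():
        raise NotADirectoryError(
            f"model path must be a directory containing model files: {model_dir}"
        )
    image_path = Path(input_image)
    if not image_path.is_file():
        raise FileNotFoundError(f"input image not found: {input_image}")
```

Calling check_binarize_inputs("./models", "179681.png") before invoking the binarizer would turn a wrong path into an error message that names the offending argument.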
  • how to use sbb_binarization within a script?

    Hi. After searching for numerous hours without success, I am wondering if someone might offer insight on how to run this from within a python script.

    (Using Windows 10, Visual Studio Code.) For example, I can run the following successfully from the terminal:

    sbb_binarize --patches -m 'C:/Users/Scott/Desktop/Python2/sbb_binarization/models' 'C:/Users/Scott/Desktop/Python2/Kpics/Pages_cropped/061r.png' 'C:/Users/Scott/Desktop/Python2/Kpics/new_test8.png'

    However, if I try the following script (using the CodeRunner extension):

    import subprocess
    def sbb_def():
        args = ['sbb_binarize', '--patches', '-m', 'C:/Users/Scott/Desktop/Python2/sbb_binarization/models', 'C:/Users/Scott/Desktop/Python2/Kpics/Pages_cropped/061r.png', 'C:/Users/Scott/Desktop/Python2/Kpics/new_test8.png']
        subprocess.Popen(args)
    sbb_def()
    

    I get the following:

    [Running] C:\ProgramData\Anaconda3\Scripts\activate.bat C:\ProgramData\Anaconda3 & python "c:\Users\Scott\Desktop\Python2\my_sbb_binarization_example.py"
    Traceback (most recent call last):
      File "c:\Users\Scott\Desktop\Python2\my_sbb_binarization_example.py", line 8, in <module>
        sbb_def()
      File "c:\Users\Scott\Desktop\Python2\my_sbb_binarization_example.py", line 6, in sbb_def
        subprocess.Popen(args)
      File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 854, in __init__
        self._execute_child(args, executable, preexec_fn, close_fds,
      File "C:\ProgramData\Anaconda3\lib\subprocess.py", line 1307, in _execute_child
        hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
    FileNotFoundError: [WinError 2] The system cannot find the file specified
    
    [Done] exited with code=1 in 0.687 seconds
    

    I don't suggest that this is a bug or anything. I'm rather sure the "issue" is mine. I'm very green at python/coding in general. Any help would be greatly appreciated.

    opened by SB2020-eye 11
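
The WinError 2 raised from CreateProcess means that Windows could not find the program named first in the argument list. One possible explanation (an assumption, not confirmed in this thread) is that the Python process started by the editor does not have the environment's Scripts directory on its PATH. A minimal sketch that resolves the executable explicitly and fails with a clearer message is shown below; build_command and run_sbb_binarize are hypothetical helper names, and the argument order follows the Usage section.

```python
import shutil
import subprocess
from typing import List


def build_command(executable: str, model_dir: str, input_image: str,
                  output_image: str, patches: bool = True) -> List[str]:
    """Assemble the argument list in the order shown in the Usage section."""
    cmd = [executable]
    if patches:
        cmd.append("--patches")
    cmd += ["-m", model_dir, input_image, output_image]
    return cmd


def run_sbb_binarize(model_dir: str, input_image: str, output_image: str,
                     patches: bool = True) -> None:
    # shutil.which searches PATH (and honours PATHEXT on Windows).
    executable = shutil.which("sbb_binarize")
    if executable is None:
        raise FileNotFoundError(
            "sbb_binarize was not found on PATH; is the right environment active?"
        )
    # check=True raises CalledProcessError if the tool exits with an error.
    subprocess.run(
        build_command(executable, model_dir, input_image, output_image, patches),
        check=True,
    )
```

Unlike subprocess.Popen, subprocess.run waits for the command to finish, so the output image is complete when the call returns.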
  • Cannot install sbb_binarization (on Windows) - TensorFlow not found (even if available)

    Hi, I am trying to set up your nice tool in a Windows environment. I am using Python 3.8. After running "pip install sbb_binarization", I get the following error:

    Collecting ocrd>=2.18.0
      Using cached ocrd-2.20.1-py3-none-any.whl (51 kB)
    ERROR: Could not find a version that satisfies the requirement tensorflow<1.16,>=1.15 (from sbb_binarization) (from versions: 2.2.0rc1, 2.2.0rc2, 2.2.0rc3, 2.2.0rc4, 2.2.0, 2.2.1, 2.3.0rc0, 2.3.0rc1, 2.3.0rc2, 2.3.0, 2.3.1, 2.4.0rc0, 2.4.0rc1)
    ERROR: No matching distribution found for tensorflow<1.16,>=1.15 (from sbb_binarization)
    

    If I call "pip list", I can see that TensorFlow is installed:

    ...
    setuptools             41.2.0
    six                    1.14.0
    stomp.py               6.0.0
    tensorboard            2.4.0
    tensorboard-plugin-wit 1.7.0
    tensorflow             2.3.1
    tensorflow-estimator   2.3.0
    termcolor              1.1.0
    urllib3                1.26.2
    ...

    Do you have any idea what to do?
    
    wontfix 
    opened by stefanCCS 11
  • Model won't load on Python 3.9

    Hey,

    After using this model for a while and having quite remarkable results as compared to standard binarization techniques, I would like to move to a newer version of python: 3.9.

    Unfortunately, the model won't load then: I get a ValueError: bad marshal data (unknown type code). To fix this, I need the raw SBB model definition so that I can load the weights into it and save the model again under the newer Python version.

    Is anyone aware of what the exact model is or where I can find it?

    Thanks! LudovA

    opened by LudovA 7
  • output is inverted in certain input formats

    I sometimes get output which looks like this:

    [Image: OCR-D-SEG-PAGE-BIN_Ratsbuecher_O_10_0237 IMG-BIN]

    The input image for this was a PNG (which someone seems to have converted somehow from an original JPEG):

    [Image: Ratsbuecher_O_10_0237]

    (That's from this GT BTW.)

    bug 
    opened by bertsky 6
  • can't install

    Hi. Running on Windows 10 OS. Using Visual Studio Code.

    Running

    (myenvironmentname) PS C:\users\scott\desktop\python2\sbb_binarization> pip install .

    I keep getting the following:

    Processing c:\users\scott\desktop\python2\sbb_binarization
        ERROR: Command errored out with exit status 1:
         command: 'C:\ProgramData\Anaconda3\envs\myenvironmentname\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\Scott\\AppData\\Local\\Temp\\pip-req-build-l7egxsl1\\setup.py'"'"'; __file__='"'"'C:\\Users\\Scott\\AppData\\Local\\Temp\\pip-req-build-l7egxsl1\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base 'C:\Users\Scott\AppData\Local\Temp\pip-pip-egg-info-notlemz5'
             cwd: C:\Users\Scott\AppData\Local\Temp\pip-req-build-l7egxsl1\
        Complete output (5 lines):
        Traceback (most recent call last):
          File "<string>", line 1, in <module>
          File "C:\Users\Scott\AppData\Local\Temp\pip-req-build-l7egxsl1\setup.py", line 6, in <module>
            with open('./ocrd-tool.json', 'r') as f:
        FileNotFoundError: [Errno 2] No such file or directory: './ocrd-tool.json'
        ----------------------------------------
    WARNING: Discarding file:///C:/users/scott/desktop/python2/sbb_binarization. Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
    ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
    

    I'm very new at all this. And my (beginner) language is Python. I don't understand the json stuff. Any help getting this installed would be greatly appreciated. :)

    opened by SB2020-eye 6
  • [transformer_model_integration] "Normal" CLI does not produce useful output

    On the transformer_model_integration branch, the normal CLI does not produce useful output

    • I'm using the image from https://qurator-data.de/examples/actevedef_718448162.first-page.zip
    • This fails with a (transparent?) empty output TIFF: sbb_binarize --patches --model-dir ~/devel/qurator-data/sbb_binarization/2022-08-16/ OCR-D-IMG_00000024.tif OCR-D-IMG_00000024-bin.tif
    • This - the OCR-D CLI - works(!): ocrd-sbb-binarize -I OCR-D-IMG -O OCR-D-IMG-BIN -P model /home/mike/devel/qurator-data/sbb_binarization/2022-08-16
    bug 
    opened by mikegerber 4
  • strange border artifacts in patch mode

    I sometimes get output which looks like this:

    | input | output |
    | --- | --- |
    | Evakuierung_von_Polen_Ansieldung_WD_Lask_0168 IMG-NORM | Evakuierung_von_Polen_Ansieldung_WD_Lask_0168 IMG-BINSBB |

    Could this be a problem with the patch size or patching in general? Should I try to crop first?

    opened by bertsky 3
  • v0.0.7 is not on PyPI

    This just came up in our team meeting: Version 0.0.7 is not on PyPI.

    (Commit history also looks like a bug fix is not in the most recent GitHub release yet but I cannot say if that bug fix warrants a new release or not.)

    opened by mikegerber 3
  • Why is --patches not the default?

    The README says:

    Note: In virtually all cases, applying the --patches flag will improve the quality of results.

    Why is it not the default? Why no --no-patches option instead?

    documentation 
    opened by mikegerber 2
  • Cannot load models in qurator-data git-annex

    $ ocrd-sbb-binarize --overwrite -I OCR-D-IMG -O OCR-D-IMG-BIN -P model /var/lib/sbb_binarization
    18:35:13.783 INFO processor.SbbBinarize - INPUT FILE 0 / PHYS_0024
    18:35:13.787 INFO processor.SbbBinarize - Binarizing on 'page' level in page 'PHYS_0024'
    /var/lib/sbb_binarization/.gitkeep
    Traceback (most recent call last):
      File "/usr/local/bin/ocrd-sbb-binarize", line 8, in <module>
        sys.exit(cli())
      File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 829, in __call__
        return self.main(*args, **kwargs)
      File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 782, in main
        rv = self.invoke(ctx)
      File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1066, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 610, in invoke
        return callback(*args, **kwargs)
      File "/usr/local/lib/python3.6/dist-packages/sbb_binarize/ocrd_cli.py", line 115, in cli
        return ocrd_cli_wrap_processor(SbbBinarizeProcessor, *args, **kwargs)
      File "/usr/local/lib/python3.6/dist-packages/ocrd/decorators/__init__.py", line 81, in ocrd_cli_wrap_processor
        run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
      File "/usr/local/lib/python3.6/dist-packages/ocrd/processor/helpers.py", line 69, in run_processor
        processor.process()
      File "/usr/local/lib/python3.6/dist-packages/sbb_binarize/ocrd_cli.py", line 66, in process
        bin_image = cv2pil(binarizer.run(image=pil2cv(page_image), use_patches=True))
      File "/usr/local/lib/python3.6/dist-packages/sbb_binarize/sbb_binarize.py", line 199, in run
        res = self.predict(model_in, image, use_patches)
      File "/usr/local/lib/python3.6/dist-packages/sbb_binarize/sbb_binarize.py", line 47, in predict
        model, model_height, model_width, n_classes = self.load_model(model_name)
      File "/usr/local/lib/python3.6/dist-packages/sbb_binarize/sbb_binarize.py", line 40, in load_model
        model = load_model(join(self.model_dir, model_name), compile=False)
      File "/usr/local/lib/python3.6/dist-packages/keras/engine/saving.py", line 492, in load_wrapper
        return load_function(*args, **kwargs)
      File "/usr/local/lib/python3.6/dist-packages/keras/engine/saving.py", line 583, in load_model
        with H5Dict(filepath, mode='r') as h5dict:
      File "/usr/local/lib/python3.6/dist-packages/keras/utils/io_utils.py", line 191, in __init__
        self.data = h5py.File(path, mode=mode)
      File "/usr/local/lib/python3.6/dist-packages/h5py/_hl/files.py", line 408, in __init__
        swmr=swmr)
      File "/usr/local/lib/python3.6/dist-packages/h5py/_hl/files.py", line 173, in make_fid
        fid = h5f.open(name, flags, fapl=fapl)
      File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
      File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
      File "h5py/h5f.pyx", line 88, in h5py.h5f.open
    OSError: Unable to open file (file signature not found)
    

    The directory /var/lib/sbb_binarization is a copy of sbb_binarization/ in our private qurator-data git-annex, which happens to include a .gitkeep file that the current code tries to load as an HDF5 file.

    opened by mikegerber 2
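
One way to avoid handing stray files such as .gitkeep to the model loader is to keep only files that begin with the HDF5 signature. The sketch below uses only the standard library and is not the project's actual fix; list_model_files is a hypothetical helper. It makes the simplifying assumption that the signature sits at byte offset 0, which ignores HDF5 files that start with a user block. If h5py is installed, h5py.is_hdf5 could be used instead.

```python
from pathlib import Path
from typing import List

# The 8-byte signature that opens an HDF5 file without a user block.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def list_model_files(model_dir: str) -> List[str]:
    """Return the names of files in model_dir that start with the HDF5 signature."""
    names = []
    for path in sorted(Path(model_dir).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories
        with path.open("rb") as f:
            if f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                names.append(path.name)
    return names
```

With such a filter, an empty .gitkeep would simply be skipped instead of raising "file signature not found".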
  • packaging inconsistency

    If I run sbb_binarize --version, I get:

    Traceback (most recent call last):
      File "/bin/sbb_binarize", line 8, in <module>
        sys.exit(main())
      File "/lib/python3.6/site-packages/click/core.py", line 1128, in __call__
        return self.main(*args, **kwargs)
      File "/lib/python3.6/site-packages/click/core.py", line 1052, in main
        with self.make_context(prog_name, args, **extra) as ctx:
      File "/lib/python3.6/site-packages/click/core.py", line 914, in make_context
        self.parse_args(ctx, args)
      File "/lib/python3.6/site-packages/click/core.py", line 1370, in parse_args
        value, args = param.handle_parse_result(ctx, opts, args)
      File "/lib/python3.6/site-packages/click/core.py", line 2347, in handle_parse_result
        value = self.process_value(ctx, value)
      File "/lib/python3.6/site-packages/click/core.py", line 2309, in process_value
        value = self.callback(ctx, self, value)
      File "/lib/python3.6/site-packages/click/decorators.py", line 383, in callback
        ) from None
    RuntimeError: 'sbb_binarize' is not installed. Try passing 'package_name' instead.
    

    Looks like the name=sbb_binarization kwarg is not consistent with the top-level module sbb_binarize IINM.

    Maybe you want to restructure your package using qurator as namespace package on that occasion?

    opened by bertsky 1
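
The mismatch described above is between the distribution name (sbb_binarization) and the importable module name (sbb_binarize): installed-version lookups go by distribution name. A minimal sketch of such a lookup with the standard library's importlib.metadata follows. It assumes Python 3.8 or later, which is newer than the 3.6 environment in the traceback, and installed_version is a hypothetical helper name.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        # Raised when no distribution with this name is installed,
        # for example when a module name is passed by mistake.
        return None
```

In an environment where the tool is installed under the distribution name sbb_binarization, installed_version("sbb_binarization") would return its version string, while a name that matches no installed distribution yields None.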
  • Batch-prediction across multiple GPUs and more efficient patch-prediction

    In order to batch-binarize thousands of images, I've rewritten the prediction script to allow us to predict around 1500-2000 images per hour on a decent machine with two GPUs.

    The proposed changes include:

    • An efficient way to compute the image patches instead of a very inefficient loop
    • Complete removal of the prediction on the down-scaled image as the results are pretty much always worse
    • Batch-prediction code that can binarize an entire directory into a given output directory while preserving the folder structure and skipping images that have already been binarized, to allow stopping and continuing the conversion
    • Multiprocessing batch-prediction across multiple GPUs using the mpire library
    • A fix for the memory leak that caused mass binarization to crash very quickly because the GPU ran out of memory. With this fix, the conversion has already been running for 16 hours without a crash.
    • Simplified loading of the model removing obsolete session-handling code

    Please note: I know that the code looks completely different now (hopefully more readable) and is probably not 1:1 compatible with the remaining code in your repository, but I tried to put all the relevant changes into this PR and make the code as self-contained as possible to allow you to update the solution as you see fit.

    Thanks for sharing the code-base with us. I hope that this PR is of some help to you.

    opened by apacha 3
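
The resumable, structure-preserving batch conversion described in the list above can be illustrated with a small sketch. This is not the code from the PR; pending_jobs and its default suffix list are hypothetical. It assumes the output directory lies outside the input directory and that an existing output file means the image was already binarized.

```python
from pathlib import Path
from typing import Iterator, Tuple


def pending_jobs(input_dir: str, output_dir: str,
                 suffixes: Tuple[str, ...] = (".png", ".jpg", ".tif"),
                 ) -> Iterator[Tuple[Path, Path]]:
    """Yield (source, target) pairs for images that still need binarizing."""
    src_root, dst_root = Path(input_dir), Path(output_dir)
    for src in sorted(src_root.rglob("*")):
        if not src.is_file() or src.suffix.lower() not in suffixes:
            continue
        # Mirror the relative path so the folder structure is preserved.
        dst = dst_root / src.relative_to(src_root)
        if dst.exists():
            continue  # produced by an earlier run, so skip it
        yield src, dst
```

A driver loop would create dst.parent if needed, binarize src, and write the result to dst; stopping and restarting then continues with whatever is left.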
  • Saving to TIFF does not work

    E.g.

    % sbb_binarize --patches --model-dir /home/mike/devel/qurator-data/sbb_binarization/2022-08-16/ OCR-D-IMG_00000024.tif OCR-D-IMG_00000024-bin.tif
    

    produces a transparent(?) TIFF with no content. No warning, no error.

    See also #46.

    bug 
    opened by mikegerber 3
  • Pinning library versions and adding Python version to README

    This PR updates requirements.txt to pin the specifically tested versions that actually work, and adds the tested Python version to the README. Later versions will not work because loading the serialized h5 models causes marshalling errors.

    It also adds the .idea directory to the list of ignored directories.

    Fixes #39

    opened by apacha 4
  • Document supported Python versions

    sbb_binarization currently needs TensorFlow 2.4, which is not available* for Python 3.10, the default on my Linux installation. Which versions are supported?

    * as in: available on PyPI:
    ERROR: Could not find a version that satisfies the requirement tensorflow==2.4.* (from sbb-binarization) (from versions: 2.8.0rc0, 2.8.0rc1, 2.8.0, 2.8.1, 2.8.2, 2.9.0rc0, 2.9.0rc1, 2.9.0rc2, 2.9.0, 2.9.1)
    ERROR: No matching distribution found for tensorflow==2.4.*
    
    documentation 
    opened by mikegerber 6
Releases(v0.0.11)
Owner
QURATOR-SPK
Curation Technologies
QURATOR-SPK
A general list of resources for image text localization and recognition: a collection of paper resources and implementations for scene text localization and recognition.

Scene Text Localization & Recognition Resources Read this institute-wise: English, Simplified Chinese. Read this year-wise: English, Simplified Chinese. Tags: [STL] (Scene Text L

Karl Lok (Zhaokai Luo) 901 Dec 11, 2022
CNN+LSTM+CTC based OCR implemented using tensorflow.

CNN_LSTM_CTC_Tensorflow CNN+LSTM+CTC based OCR(Optical Character Recognition) implemented using tensorflow. Note: there is No restriction on the numbe

Watson Yang 356 Dec 08, 2022
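Ddddocr - general-purpose CAPTCHA recognition OCR, PyPI version

DdddOcr general-purpose CAPTCHA recognition SDK, free and open-source edition. ddddocr has been updated again today! The current version is 1.3.1. Many newcomers to CAPTCHA recognition surely find click-to-select images a headache; preparing samples is time-consuming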

Sml2h3 4.4k Dec 31, 2022
Generating .npy dataset and labels out of given image, containing numbers from 0 to 9, using opencv

basic-dataset-generator-from-image-of-numbers generating .npy dataset and labels out of given image, containing numbers from 0 to 9, using opencv inpu

1 Jan 01, 2022
QED-C: The Quantum Economic Development Consortium provides these computer programs and software for use in the fields of quantum science and engineering.

Application-Oriented Performance Benchmarks for Quantum Computing This repository contains a collection of prototypical application- or algorithm-cent

SRI International 67 Nov 30, 2022
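chineseocr/table_line: table line detection model, PyTorch version

table_line_pytorch chineseocr/table_detct table line detection model table_line, PyTorch version. Original project on GitHub: https://github.com/chineseocr/table-detect 1. Model conversion: download the original project's table_detect model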

1 Oct 21, 2021
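A sample of camera calibration using OpenCV. As of 2021/06/21, it covers the three types that have Python implementations: ordinary cameras, fisheye lenses (fisheye module), and omnidirectional cameras (omnidir module).

OpenCV-CameraCalibration-Example FishEyeCameraCalibration.mp4 A sample of camera calibration using OpenCV. As of 2021/06/21, it covers the following three types that have Python implementations: ordinary cameras, fisheye lenses (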

KazuhitoTakahashi 34 Nov 17, 2022
Captcha Recognition

The objective of this project is to recognize the target numbers in the captcha images correctly which would tell us how good or bad a captcha system has been built.

Mohit Kaushik 5 Feb 20, 2022
Handwritten Character Recognition using CNN

Handwritten Character Recognition using CNN Problem Definition The main objective of this project is to solve the problem of handwritten character rec

Mohit Kaushik 4 Mar 02, 2022

Installations for running keras-theano on GPU Upgrade pip and install opencv2 cd ~ pip install --upgrade pip pip install opencv-python Upgrade keras

Berat Kurar Barakat 14 Sep 30, 2022
Creating of virtual elements of the graphical interface using opencv and mediapipe.

Virtual GUI Creating of virtual elements of the graphical interface using opencv and mediapipe. Element GUI Output Description Button By default the b

Aleksei 4 Jun 16, 2022
Textboxes : Image Text Detection Model : python package (tensorflow)

shinTB Abstract A python package for use Textboxes : Image Text Detection Model implemented by tensorflow, cv2 Textboxes Paper Review in Korean (My Bl

Jayne Shin (신재인) 91 Dec 15, 2022
Localization of thoracic abnormalities model based on VinBigData (top 1%)

Repository contains the code for 2nd place solution of VinBigData Chest X-ray Abnormalities Detection competition. The goal of competition was to auto

33 May 24, 2022
Controlling Volume by Hand Gestures

This program allows the user to control the volume of their device with specific hand gestures involving their thumb and index finger!

Riddhi Bajaj 1 Nov 11, 2021
When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework (CVPR 2021 oral)

MTLFace This repository contains the PyTorch implementation and the dataset of the paper: When Age-Invariant Face Recognition Meets Face Age Synthesis

Hzzone 120 Jan 05, 2023
Can We Find Neurons that Cause Unrealistic Images in Deep Generative Networks?

Can We Find Neurons that Cause Unrealistic Images in Deep Generative Networks? Artifact Detection/Correction - Offcial PyTorch Implementation This rep

CHOI HWAN IL 23 Dec 20, 2022
A bot that plays TFT using OCR. Keeps track of bench, board, items, and plays the user defined team comp.

NOTES: To ensure best results, make sure you are running this on a computer that has decent specs. 1920x1080 fullscreen is required in League, game mu

francis 125 Dec 30, 2022
QuanTaichi: A Compiler for Quantized Simulations (SIGGRAPH 2021)

QuanTaichi: A Compiler for Quantized Simulations (SIGGRAPH 2021) Yuanming Hu, Jiafeng Liu, Xuanda Yang, Mingkuan Xu, Ye Kuang, Weiwei Xu, Qiang Dai, W

Taichi Developers 119 Dec 02, 2022
PyQT5 app that colorize black & white pictures using CNN(use pre-trained model which was made with OpenCV)

About PyQT5 app that colorize black & white pictures using CNN(use pre-trained model which was made with OpenCV) Colorizor An application for the project Yand

1 Apr 04, 2022
Volume Control using OpenCV

Gesture-Volume-Control Volume Control using OpenCV Here i made volume control using Python and OpenCV in which we can control the volume of our laptop

Mudit Sinha 3 Oct 10, 2021