Layout Analysis Evaluator for the ICDAR 2017 competition on Layout Analysis for Challenging Medieval Manuscripts

Overview

LayoutAnalysisEvaluator

Layout Analysis Evaluator for:

Minimal usage: java -jar LayoutAnalysisEvaluator.jar -gt gt_image.png -p prediction_image.png

Parameters list: utility-name

 -gt,--groundTruth <arg>      Ground Truth image 
 -p,--prediction <arg>        Prediction image 
 -o,--original <arg>          (Optional) Original image, to be overlapped with the results visualization
 -j,--json <arg>              (Optional) Json Path, for the DIVAServices JSON output
 -out,--outputPath <arg>      (Optional) Output path (relative to prediction input path)                            
 -dv,--disableVisualization   (Optional)(Flag) Vsualizing the evaluation as image is NOT desired

Note: this also outputs a human-friendly visualization of the results next to the prediction_image.png which can be overlapped to the original image if provided with the parameter -overlap to enable deeper analysis.

Visualization of the results

Along with the numerical results (such as the Intersection over Union (IU), precision, recall,F1) the tool provides a human friendly visualization of the results. Additionally, when desired one can provide the original image and it will be overlapped with the visualization of the results. This is particularly helpful to understand why certain artifacts are created. The three images below represent the three steps: the original image, the visualization of the result and the two overlapped.

Alt text Alt text Alt text

Interpreting the colors

Pixel colors are assigned as follows:

  • GREEN: Foreground predicted correctly
  • YELLOW: Foreground predicted - but the wrong class (e.g. Text instead of Comment)
  • BLACK: Background predicted correctly
  • RED: Background mis-predicted as Foreground
  • BLUE: Foreground mis-predicted as Background

Example of problem hunting

Below there is an example supporting the usefulness of overlapping the prediction quality visualization with the original image. Focus on the red pixels pointed at by the white arrow: they are background pixels mis-classified as foreground. In the normal visualization (left) its not possible to know why would an algorithm decide that in that spot there is something belonging to foreground, as it is clearly far from regular text. However, when overlapped with the original image (right) one can clearly see that in this area there is an ink stain which could explain why the classification algorithm is deceived into thinking these pixel were foreground. This kind of interpretation is obviously not possible without the information provided by the original image like in (right).

Alt text Alt text

Ground Truth Format

The ground truth information needs to be a pixel-label image where the class information is encoded in the blue channel. Red and green channels should be set to 0 with the exception of the boundaries pixels used in the two competitions mentioned above.

For example, in the DIVA-HisDB dataset there are four different annotated classes which might overlap: main text body, decorations, comments and background.

In the pixel-label images the classes are encoded by RGB values as follows:

Red = 0 everywhere (except boundaries)
Green = 0 everywhere

Blue = 0b00...1000 = 0x000008: main text body
Blue = 0b00...0100 = 0x000004: decoration
Blue = 0b00...0010 = 0x000002: comment
Blue = 0b00...0001 = 0x000001: background (out of page)

Note that the GT might contain multi-class labeled pixels, for all classes except for the background. For example:

Blue = 0b...1000 | 0b...0010 = 0b...1010 = 0x00000A : main text body + comment  
Blue = 0b...1000 | 0b...0100 = 0b...1100 = 0x00000C : main text body + decoration
Blue = 0b...0010 | 0b...0100 = 0b...0110 = 0x000006 : comment + decoration

Citing us

If you use our software, please cite our paper as:

@inproceedings{alberti2017evaluation,
    address = {Kyoto, Japan},
    archivePrefix = {arXiv},
    arxivId = {1712.01656},
    author = {Alberti, Michele and Bouillon, Manuel and Ingold, Rolf and Liwicki, Marcus},
    booktitle = {2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)},
    doi = {10.1109/ICDAR.2017.311},
    eprint = {1712.01656},
    isbn = {978-1-5386-3586-5},
    month = {nov},
    pages = {43--47},
    title = {{Open Evaluation Tool for Layout Analysis of Document Images}},
    year = {2017}
}
You might also like...
Python-based tools for document analysis and OCR

ocropy OCRopus is a collection of document analysis programs, not a turn-key OCR system. In order to apply it to your documents, you may need to do so

CellProfiler is a open-source application for biological image analysis
CellProfiler is a open-source application for biological image analysis

CellProfiler is a free open-source software designed to enable biologists without training in computer vision or programming to quantitatively measure phenotypes from thousands of images automatically.

Python-based tools for document analysis and OCR

ocropy OCRopus is a collection of document analysis programs, not a turn-key OCR system. In order to apply it to your documents, you may need to do so

Pre-trained BERT Models for Ancient and Medieval Greek, and associated code for LaTeCH 2021 paper titled -
Pre-trained BERT Models for Ancient and Medieval Greek, and associated code for LaTeCH 2021 paper titled - "A Pilot Study for BERT Language Modelling and Morphological Analysis for Ancient and Medieval Greek"

Ancient Greek BERT The first and only available Ancient Greek sub-word BERT model! State-of-the-art post fine-tuning on Part-of-Speech Tagging and Mor

2nd solution of ICDAR 2021 Competition on Scientific Literature Parsing, Task B.
2nd solution of ICDAR 2021 Competition on Scientific Literature Parsing, Task B.

TableMASTER-mmocr Contents About The Project Method Description Dependency Getting Started Prerequisites Installation Usage Data preprocess Train Infe

1st Solution For ICDAR 2021 Competition on Mathematical Formula Detection
1st Solution For ICDAR 2021 Competition on Mathematical Formula Detection

This project releases our 1st place solution on ICDAR 2021 Competition on Mathematical Formula Detection. We implement our solution based on MMDetection, which is an open source object detection toolbox based on PyTorch.

Code for the DH project "Dhimmis & Muslims – Analysing Multireligious Spaces in the Medieval Muslim World"

Damast This repository contains code developed for the digital humanities project "Dhimmis & Muslims – Analysing Multireligious Spaces in the Medieval

Layout Parser is a deep learning based tool for document image layout analysis tasks.
G-Research-Crypto-Competition - Project for passing the ML exam. Dataset took from the competition on the kaggle

G-Research-Crypto-Competition Project for passing the ML exam. Dataset took from

PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)
PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)

Vision Transformer for Fast and Efficient Scene Text Recognition (ICDAR 2021) ViTSTR is a simple single-stage model that uses a pre-trained Vision Tra

Official implementation of SynthTIGER (Synthetic Text Image GEneratoR) ICDAR 2021
Official implementation of SynthTIGER (Synthetic Text Image GEneratoR) ICDAR 2021

🐯 SynthTIGER: Synthetic Text Image GEneratoR Official implementation of SynthTIGER | Paper | Datasets Moonbin Yim1, Yoonsik Kim1, Han-cheol Cho1, Sun

Official implementation for ICDAR 2021 paper "Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer"

Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer Description Convert offline handwritten mathematical expressi

A mathematica expression evaluator with PokemonTypes

A simple mathematical expression evaluator that uses Pokemon types to replace symbols.

The evaluator covering all of the metrics required by tasks within the DUE Benchmark.

DUE Evaluator The repository contains the evaluator covering all of the metrics required by tasks within the DUE Benchmark, i.e., set-based F1 (for KI

Excel-report-evaluator - A simple Python GUI application to aid with bulk evaluation of Microsoft Excel reports.
Excel-report-evaluator - A simple Python GUI application to aid with bulk evaluation of Microsoft Excel reports.

Excel Report Evaluator Simple Python GUI with Tkinter for evaluating Microsoft Excel reports (.xlsx-Files). Usage Start main.py and choose one of the

 Binance Smart Chain Contract Scraper + Contract Evaluator
Binance Smart Chain Contract Scraper + Contract Evaluator

Pulls Binance Smart Chain feed of newly-verified contracts every 30 seconds, then checks their contract code for links to socials.Returns only those with socials information included, and then submits the contract address to TokenSniffer to evaluate contract legitimacy

Binance Smart Chain Contract Scraper + Contract Evaluator
Binance Smart Chain Contract Scraper + Contract Evaluator

Pulls Binance Smart Chain feed of newly-verified contracts every 30 seconds, then checks their contract code for links to socials.Returns only those with socials information included, and then submits the contract address to TokenSniffer to evaluate contract legitimacy

Boost learning for GNNs from the graph structure under challenging heterophily settings. (NeurIPS'20)

Beyond Homophily in Graph Neural Networks: Current Limitations and Effective Designs Jiong Zhu, Yujun Yan, Lingxiao Zhao, Mark Heimann, Leman Akoglu,

IMGUR5K handwriting set. It is a handwritten in-the-wild dataset, which contains challenging real world handwritten samples from different writers.The dataset is shared as a set of image urls with annotations. This code downloads the images and verifies the hash to the image to avoid data contamination.
Releases(v1.0.0)
Optical character recognition for Japanese text, with the main focus being Japanese manga

Manga OCR Optical character recognition for Japanese text, with the main focus being Japanese manga. It uses a custom end-to-end model built with Tran

Maciej Budyś 327 Jan 01, 2023
2 telegram-bots: for image recognition and for text generation

💻 📱 Telegram_Bots 🔎 & 📖 2 telegram-bots: for image recognition and for text generation. About Image recognition bot: User sends a photo and bot de

Marina Polukoshko 1 Jan 27, 2022
Implement 'Single Shot Text Detector with Regional Attention, ICCV 2017 Spotlight'

SSTDNet Implement 'Single Shot Text Detector with Regional Attention, ICCV 2017 Spotlight' using pytorch. This code is work for general object detecti

HotaekHan 84 Jan 05, 2022
TextField: Learning A Deep Direction Field for Irregular Scene Text Detection (TIP 2019)

TextField: Learning A Deep Direction Field for Irregular Scene Text Detection Introduction The code and trained models of: TextField: Learning A Deep

Yukang Wang 101 Dec 12, 2022
A simple Security Camera created using Opencv in Python where images gets saved in realtime in your Dropbox account at every 5 seconds

Security Camera using Opencv & Dropbox This is a simple Security Camera created using Opencv in Python where images gets saved in realtime in your Dro

Arpit Rath 1 Jan 31, 2022
The CIS OCR PostCorrectionTool

The CIS OCR Post Correction Tool PoCoTo Source code for the Java-based PoCoTo client enabling fast interactive batch corrections of complete OCR error

CIS OCR Group 36 Dec 15, 2022
A PyTorch implementation of ECCV2018 Paper: TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes

TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes A PyTorch implement of TextSnake: A Flexible Representation for Detecting

Prince Wang 417 Dec 12, 2022
Text page dewarping using a "cubic sheet" model

page_dewarp Page dewarping and thresholding using a "cubic sheet" model - see full writeup at https://mzucker.github.io/2016/08/15/page-dewarping.html

Matt Zucker 1.2k Dec 29, 2022
OCR software for recognition of handwritten text

Handwriting OCR The project tries to create software for recognition of a handwritten text from photos (also for Czech language). It uses computer vis

Břetislav Hájek 562 Jan 03, 2023
Controlling the computer volume with your hands // OpenCV

HandsControll-AI Controlling the computer volume with your hands // OpenCV Step 1 git clone https://github.com/Hayk-21/HandsControll-AI.git pip instal

Hayk 1 Nov 04, 2021
Image Detector and Convertor App created using python's Pillow, OpenCV, cvlib, numpy and streamlit packages.

Image Detector and Convertor App created using python's Pillow, OpenCV, cvlib, numpy and streamlit packages.

Siva Prakash 11 Jan 02, 2022
Code release for our paper, "SimNet: Enabling Robust Unknown Object Manipulation from Pure Synthetic Data via Stereo"

SimNet: Enabling Robust Unknown Object Manipulation from Pure Synthetic Data via Stereo Thomas Kollar, Michael Laskey, Kevin Stone, Brijen Thananjeyan

68 Dec 14, 2022
Total Text Dataset. It consists of 1555 images with more than 3 different text orientations: Horizontal, Multi-Oriented, and Curved, one of a kind.

Total-Text-Dataset (Official site) Updated on April 29, 2020 (Detection leaderboard is updated - highlighted E2E methods. Thank you shine-lcy.) Update

Chee Seng Chan 671 Dec 27, 2022
SRA's seminar on Introduction to Computer Vision Fundamentals

Introduction to Computer Vision This repository includes basics to : Python Numpy: A python library Git Computer Vision. The aim of this repository is

Society of Robotics and Automation 147 Dec 04, 2022
Text to QR-CODE

QR CODE GENERATO USING PYTHON Author : RAFIK BOUDALIA. Installation Use the package manager pip to install foobar. pip install pyqrcode Usage from tki

Rafik Boudalia 2 Oct 13, 2021
零样本学习测评基准,中文版

ZeroCLUE 零样本学习测评基准,中文版 零样本学习是AI识别方法之一。 简单来说就是识别从未见过的数据类别,即训练的分类器不仅仅能够识别出训练集中已有的数据类别, 还可以对于来自未见过的类别的数据进行区分。 这是一个很有用的功能,使得计算机能够具有知识迁移的能力,并无需任何训练数据, 很符合现

CLUE benchmark 27 Dec 10, 2022
Some codes from PyImageSearch course's and external projects.

👨‍💻 Some codes and projects 👨‍💻 💡 Technologies 📜 Projects 📍 Chrome Dinosaur Controller 📦 Script 📍 Coins Counter 📦 Script 🤓 Author Lucas Biv

Lucas Bivar 25 Oct 24, 2021
Assignment work with webcam

work with webcam : Press key 1 to use emojy on your face Press key 2 to use lip and eye on your face Press key 3 to checkered your face Press key 4 to

Hanane Kheirandish 2 May 31, 2022
Image processing in Python

scikit-image: Image processing in Python Website (including documentation): https://scikit-image.org/ Mailing list: https://mail.python.org/mailman3/l

Image Processing Toolbox for SciPy 5.2k Dec 30, 2022
Automatically remove the mosaics in images and videos, or add mosaics to them.

Automatically remove the mosaics in images and videos, or add mosaics to them.

Hypo 1.4k Dec 30, 2022