Layout Analysis Evaluator for the ICDAR 2017 competition on Layout Analysis for Challenging Medieval Manuscripts

Overview

LayoutAnalysisEvaluator

Layout Analysis Evaluator for:

Minimal usage: java -jar LayoutAnalysisEvaluator.jar -gt gt_image.png -p prediction_image.png

Parameters list: utility-name

 -gt,--groundTruth <arg>      Ground Truth image 
 -p,--prediction <arg>        Prediction image 
 -o,--original <arg>          (Optional) Original image, to be overlapped with the results visualization
 -j,--json <arg>              (Optional) Json Path, for the DIVAServices JSON output
 -out,--outputPath <arg>      (Optional) Output path (relative to prediction input path)                            
 -dv,--disableVisualization   (Optional)(Flag) Visualizing the evaluation as image is NOT desired

Note: the tool also writes a human-friendly visualization of the results next to prediction_image.png. If the original image is provided with the parameter -o, the visualization is overlapped with it to enable deeper analysis.
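As a rough illustration of how the parameters above combine, the sketch below assembles the command line from Python. The flags are the ones listed in the parameter table; the helper names `build_command` and `run_evaluator`, and the default jar location, are assumptions made for this example and are not part of the tool.

```python
import subprocess
from typing import List, Optional


def build_command(gt: str, prediction: str,
                  original: Optional[str] = None,
                  json_path: Optional[str] = None,
                  output_path: Optional[str] = None,
                  disable_visualization: bool = False,
                  jar: str = "LayoutAnalysisEvaluator.jar") -> List[str]:
    """Assemble the argument list for the evaluator (hypothetical helper)."""
    # The two mandatory arguments: ground truth and prediction images.
    cmd = ["java", "-jar", jar, "-gt", gt, "-p", prediction]
    if original is not None:
        # Optional original image, overlapped with the visualization.
        cmd += ["-o", original]
    if json_path is not None:
        # Optional path for the DIVAServices JSON output.
        cmd += ["-j", json_path]
    if output_path is not None:
        # Optional output path, relative to the prediction input path.
        cmd += ["-out", output_path]
    if disable_visualization:
        # Flag: do not produce the visualization image.
        cmd.append("-dv")
    return cmd


def run_evaluator(*args, **kwargs) -> None:
    """Run the evaluator; assumes Java and the jar are available locally."""
    subprocess.run(build_command(*args, **kwargs), check=True)
```

For instance, `build_command("gt_image.png", "prediction_image.png", original="original_image.png")` yields the minimal usage command extended with `-o original_image.png`.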

Visualization of the results

Along with the numerical results (such as the Intersection over Union (IU), precision, recall, and F1), the tool provides a human-friendly visualization of the results. Additionally, one can provide the original image, and it will be overlapped with the visualization of the results. This is particularly helpful for understanding why certain artifacts are created. The three images below represent the three steps: the original image, the visualization of the result, and the two overlapped.

[Images: the original image, the visualization of the result, and the two overlapped]
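The numerical results named above (IU, precision, recall, F1) have standard definitions in terms of true positive, false positive and false negative pixel counts. The sketch below computes them for a single class. It only illustrates those textbook formulas: how the tool counts pixels or averages across classes is not described here, and the function name `pixel_metrics` is an assumption.

```python
from typing import Dict


def pixel_metrics(tp: int, fp: int, fn: int) -> Dict[str, float]:
    """Standard per-class metrics from pixel counts (illustrative only)."""

    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of failing when a class has no pixels at all.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)        # correct among predicted pixels
    recall = ratio(tp, tp + fn)           # correct among ground truth pixels
    f1 = ratio(2 * precision * recall, precision + recall)
    iu = ratio(tp, tp + fp + fn)          # intersection over union
    return {"precision": precision, "recall": recall, "f1": f1, "iu": iu}
```

With 8 true positives, 2 false positives and 2 false negatives, precision, recall and F1 are all 0.8, while IU is 8/12, which shows that IU is the stricter of the measures.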

Interpreting the colors

Pixel colors are assigned as follows:

  • GREEN: Foreground predicted correctly
  • YELLOW: Foreground predicted - but the wrong class (e.g. Text instead of Comment)
  • BLACK: Background predicted correctly
  • RED: Background mis-predicted as Foreground
  • BLUE: Foreground mis-predicted as Background
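The color assignment above can be sketched as a small decision function over the blue-channel labels of one pixel (see the Ground Truth Format section for the encoding). This is a simplified reading, not the tool's implementation: it assumes background is exactly the value 0x1 and treats a foreground pixel as correct only when the predicted label equals the ground truth label. How the tool scores partially matching multi-class pixels is not stated here. The name `visualization_color` is hypothetical.

```python
BACKGROUND = 0x1  # blue-channel value for background, per the GT format


def visualization_color(gt_blue: int, pred_blue: int) -> str:
    """Map one (ground truth, prediction) pixel pair to a legend color."""
    gt_is_bg = gt_blue == BACKGROUND
    pred_is_bg = pred_blue == BACKGROUND
    if gt_is_bg and pred_is_bg:
        return "BLACK"   # background predicted correctly
    if gt_is_bg:
        return "RED"     # background mis-predicted as foreground
    if pred_is_bg:
        return "BLUE"    # foreground mis-predicted as background
    # Both are foreground. Assumption: exact label match counts as correct.
    return "GREEN" if gt_blue == pred_blue else "YELLOW"
```

For example, a ground truth comment pixel (0x2) predicted as main text body (0x8) maps to YELLOW, which matches the "Text instead of Comment" case in the list.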

Example of problem hunting

Below is an example showing why it is useful to overlap the prediction quality visualization with the original image. Focus on the red pixels pointed at by the white arrow: they are background pixels mis-classified as foreground. In the normal visualization (left) it is not possible to know why an algorithm would decide that this spot belongs to the foreground, as it is clearly far from regular text. However, when overlapped with the original image (right), one can clearly see that this area contains an ink stain, which could explain why the classification algorithm is deceived into thinking these pixels are foreground. This kind of interpretation is not possible without the information provided by the original image, as in (right).

[Images: the normal visualization (left) and the visualization overlapped with the original image (right)]

Ground Truth Format

The ground truth information needs to be a pixel-label image where the class information is encoded in the blue channel. Red and green channels should be set to 0, with the exception of the boundary pixels used in the two competitions mentioned above.

For example, in the DIVA-HisDB dataset there are four different annotated classes which might overlap: main text body, decorations, comments and background.

In the pixel-label images the classes are encoded by RGB values as follows:

Red = 0 everywhere (except boundaries)
Green = 0 everywhere

Blue = 0b00...1000 = 0x000008: main text body
Blue = 0b00...0100 = 0x000004: decoration
Blue = 0b00...0010 = 0x000002: comment
Blue = 0b00...0001 = 0x000001: background (out of page)

Note that the GT might contain multi-class labeled pixels, for all classes except for the background. For example:

Blue = 0b...1000 | 0b...0010 = 0b...1010 = 0x00000A : main text body + comment  
Blue = 0b...1000 | 0b...0100 = 0b...1100 = 0x00000C : main text body + decoration
Blue = 0b...0010 | 0b...0100 = 0b...0110 = 0x000006 : comment + decoration
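The bit encoding above can be decoded with simple masking. The sketch below takes a pixel as a packed 0xRRGGBB integer and returns the class names set in its blue channel. The bit values come from the table above; the function names are assumptions for this example. Treating any nonzero red value as a boundary marker follows from "Red = 0 everywhere (except boundaries)", but the actual red value used for boundaries is not given here.

```python
from typing import List, Tuple

# (bit, class name) pairs as listed in the encoding above.
CLASS_BITS = [
    (0x8, "main text body"),
    (0x4, "decoration"),
    (0x2, "comment"),
    (0x1, "background"),
]


def split_rgb(rgb: int) -> Tuple[int, int, int]:
    """Split a packed 0xRRGGBB value into (red, green, blue)."""
    return (rgb >> 16) & 0xFF, (rgb >> 8) & 0xFF, rgb & 0xFF


def decode_labels(rgb: int) -> List[str]:
    """Return every class whose bit is set in the blue channel."""
    _, _, blue = split_rgb(rgb)
    return [name for bit, name in CLASS_BITS if blue & bit]


def is_boundary(rgb: int) -> bool:
    """Assumption: a nonzero red channel marks a boundary pixel."""
    red, _, _ = split_rgb(rgb)
    return red != 0
```

For example, `decode_labels(0x00000A)` returns `["main text body", "comment"]`, the first multi-class case listed above.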

Citing us

If you use our software, please cite our paper as:

@inproceedings{alberti2017evaluation,
    address = {Kyoto, Japan},
    archivePrefix = {arXiv},
    arxivId = {1712.01656},
    author = {Alberti, Michele and Bouillon, Manuel and Ingold, Rolf and Liwicki, Marcus},
    booktitle = {2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)},
    doi = {10.1109/ICDAR.2017.311},
    eprint = {1712.01656},
    isbn = {978-1-5386-3586-5},
    month = {nov},
    pages = {43--47},
    title = {{Open Evaluation Tool for Layout Analysis of Document Images}},
    year = {2017}
}