TedEval: A Fair Evaluation Metric for Scene Text Detectors

Overview

TedEval: A Fair Evaluation Metric for Scene Text Detectors

Official Python 3 implementation of TedEval | paper | slides

Chae Young Lee, Youngmin Baek, and Hwalsuk Lee.

Clova AI Research, NAVER Corp.

Overview

We propose a new evaluation metric for scene text detectors called TedEval. Through separate instance-level matching policy and character-level scoring policy, TedEval solves the drawbacks of previous metrics such as IoU and DetEval. This code is based on ICDAR15 official evaluation code.

Methodology

1. Mathcing Policy

  • Non-exclusively gathers all possible matches of not only one-to-one but also one-to-many and many-to-one.
  • The threshold of both area recall and area precision are set to 0.4.
  • Multiline is identified and rejected when |min(theta, 180 - theta)| > 45 from Fig. 2.

2. Scoring Policy

We compute Pseudo Character Center (PCC) from word-level bounding boxes and penalize matches when PCCs are missing or overlapping.

Sample Evaluation

Experiments

We evaluated state-of-the-art scene text detectors with TedEval on two benchmark datasets: ICDAR 2013 Focused Scene Text (IC13) and ICDAR 2015 Incidental Scene Text (IC15). Detectors are listed in the order of published dates.

ICDAR 2013

Detector Date (YY/MM/DD) Recall (%) Precision (%) H-mean (%)
CTPN 16/09/12 82.1 92.7 87.6
RRPN 17/03/03 89.0 94.2 91.6
SegLink 17/03/19 65.6 74.9 70.0
EAST 17/04/11 77.7 87.1 82.5
WordSup 17/08/22 87.5 92.2 90.2
PixelLink 18/01/04 84.0 87.2 86.1
FOTS 18/01/05 91.5 93.0 92.6
TextBoxes++ 18/01/09 87.4 92.3 90.0
MaskTextSpotter 18/07/06 90.2 95.4 92.9
PMTD 19/03/28 94.0 95.2 94.7
CRAFT 19/04/03 93.6 96.5 95.1

ICDAR 2015

Detector Date (YY/MM/DD) Recall (%) Precision (%) H-mean (%)
CTPN 16/09/12 85.0 81.1 67.8
RRPN 17/03/03 79.5 85.9 82.6
SegLink 17/03/19 77.1 83.9 80.6
EAST 17/04/11 82.5 90.0 86.3
WordSup 17/08/22 83.2 87.1 85.2
PixelLink 18/01/04 85.7 86.1 86.0
FOTS 18/01/05 89.0 93.4 91.2
TextBoxes++ 18/01/09 82.4 90.8 86.5
MaskTextSpotter 18/07/06 82.5 91.8 86.9
PMTD 19/03/28 89.2 92.8 91.0
CRAFT 19/04/03 88.5 93.1 90.9

Frequency

Getting Started

Clone repository

git clone https://github.com/clovaai/TedEval.git

Requirements

  • python 3
  • python 3.x Polygon, Bottle, Pillow
# install
pip3 install Polygon3 bottle Pillow

Supported Annotation Type

  • LTRB(xmin, ymin, xmax, ymax)
  • QUAD(x1, y1, x2, y2, x3, y3, x4, y4)

Evaluation

Prepare data

The ground truth and the result data should be text files, one for each sample. Note that the naming rule of each text file is that there must be img_{number} in the filename and that the number indicate the image sample.

# gt/gt_img_38.txt
644,101,932,113,932,168,643,156,[email protected]
477,138,487,139,488,149,477,148,###
344,131,398,130,398,149,344,149,###
1195,148,1277,138,1277,177,1194,187,###
23,270,128,267,128,282,23,284,###

# result/res_img_38.txt
644,101,932,113,932,168,643,156,{Transcription},{Confidence}
477,138,487,139,488,149,477,148
344,131,398,130,398,149,344,149
1195,148,1277,138,1277,177,1194,187
23,270,128,267,128,282,23,284

Compress these text files.

zip gt.zip gt/*
zip result.zip result/*

Refer to gt/result.zip and gt/gt_*.zip for examples.

Run stand-alone evaluation

python script.pyg=gt/gt.zips=result/result.zip
  • Locate the path of GT and submission file using the flag -g and -s, respectively.
  • QUAD annotation type is used as default. To switch between {QUAD, LTRB}, add -p='{"LTRB" = False}' in the command or directly modify the default_evaluation_params() function in script.py.
  • If there are transcription or confidence values in your submission file, add -p='{"CONFIDENCES" = True} or -p='{"TRANSCRIPTION" = True}'.

Run Visualizer

python web.py
  • Place the zip file of images and GTs of the dataset named images.zip and gt.zip, respectively, in the gt directory.
  • Create an empty directory name output. This is where the DB, submission files, and result files will be created.
  • You can change the host and port number in the final line of web.py.

The file structure should then be:

.
├── gt
│   ├── gt.zip
│   └── images.zip
├── output   # empty dir
├── script.py
├── web.py
├── README.md
└── ...

Citation

@article{lee2019tedeval,
  title={TedEval: A Fair Evaluation Metric for Scene Text Detectors},
  author={Lee, Chae Young and Baek, Youngmin and Lee, Hwalsuk},
  journal={arXiv preprint arXiv:1907.01227},
  year={2019}
}

Contact us

We welcome any feedbacks to our metric. Please contact the authors via {cylee7133, youngmin.baek, hwalsuk.lee}@gmail.com. In case of code errors, open an issue and we will get to you.

License

Copyright (c) 2019-present NAVER Corp.

 Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

 The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
Owner
Clova AI Research
Open source repository of Clova AI Research, NAVER & LINE
Clova AI Research
A small C++ implementation of LSTM networks, focused on OCR.

clstm CLSTM is an implementation of the LSTM recurrent neural network model in C++, using the Eigen library for numerical computations. Status and sco

Tom 794 Dec 30, 2022
Deskewing images with slanted content

skew_correction De-skewing images with slanted content by finding the deviation using Canny Edge Detection. To Run: In python 3.6, from deskew import

13 Aug 27, 2022
Kornia is a open source differentiable computer vision library for PyTorch.

Open Source Differentiable Computer Vision Library

kornia 7.6k Jan 06, 2023
Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Vedaldi, Andrew Zisserman, CVPR 2016.

SynthText Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Ved

Ankush Gupta 1.8k Dec 28, 2022
Official implementation of Character Region Awareness for Text Detection (CRAFT)

CRAFT: Character-Region Awareness For Text detection Official Pytorch implementation of CRAFT text detector | Paper | Pretrained Model | Supplementary

Clova AI Research 2.5k Jan 03, 2023
Image Smoothing and Blurring Using OpenCV

Image-Smoothing-and-Blurring-Using-OpenCV This repository contains codes for performing image smoothing and blurring using OpenCV. There are different

Happy N. Monday 3 Feb 15, 2022
Characterizing possible failure modes in physics-informed neural networks.

Characterizing possible failure modes in physics-informed neural networks This repository contains the PyTorch source code for the experiments in the

Aditi Krishnapriyan 55 Jan 02, 2023
基于Paddle框架的PSENet复现

PSENet-Paddle 基于Paddle框架的PSENet复现 本项目基于paddlepaddle框架复现PSENet,并参加百度第三届论文复现赛,将在2021年5月15日比赛完后提供AIStudio链接~敬请期待 AIStudio链接 参考项目: whai362-PSENet 环境配置 本项目

QuanHao Guo 4 Apr 24, 2022
A python screen recorder for low-end computers, provides high quality video output.

RecorderX - v1.0 A screen recorder made in Python with the help of OpenCv, it has ability to record your screen in high quality. No matter what your P

Priyanshu Jindal 4 Nov 10, 2021
Zoom , GoogleMeets에서 Vtuber 데뷔하기

EasyVtuber Facial landmark와 GAN을 이용한 Character Face Generation Google Meets, Zoom 등에서 자신만의 웹툰, 만화 캐릭터로 대화해보세요! 악세사리는 어느정도 추가해도 잘 작동해요! 안타깝게도 RTX 2070

Gunwoo Han 140 Dec 23, 2022
Image augmentation library in Python for machine learning.

Augmentor is an image augmentation library in Python for machine learning. It aims to be a standalone library that is platform and framework independe

Marcus D. Bloice 4.8k Jan 04, 2023
Deep learning based page layout analysis

Deep Learning Based Page Layout Analyze This is a Python implementaion of page layout analyze tool. The goal of page layout analyze is to segment page

186 Dec 29, 2022
This is the official PyTorch implementation of the paper "TransFG: A Transformer Architecture for Fine-grained Recognition" (Ju He, Jie-Neng Chen, Shuai Liu, Adam Kortylewski, Cheng Yang, Yutong Bai, Changhu Wang, Alan Yuille).

TransFG: A Transformer Architecture for Fine-grained Recognition Official PyTorch code for the paper: TransFG: A Transformer Architecture for Fine-gra

Ju He 307 Jan 03, 2023
This project is basically to draw lines with your hand, using python, opencv, mediapipe.

Paint Opencv 📷 This project is basically to draw lines with your hand, using python, opencv, mediapipe. Screenshoots 📱 Tools ⚙️ Python Opencv Mediap

Williams Ismael Bobadilla Torres 3 Nov 17, 2021
Read-only mirror of https://gitlab.gnome.org/GNOME/ocrfeeder

================================= OCRFeeder - A Complete OCR Suite ================================= OCRFeeder is a complete Optical Character Recogn

GNOME Github Mirror 81 Dec 23, 2022
CTPN + DenseNet + CTC based end-to-end Chinese OCR implemented using tensorflow and keras

简介 基于Tensorflow和Keras实现端到端的不定长中文字符检测和识别 文本检测:CTPN 文本识别:DenseNet + CTC 环境部署 sh setup.sh 注:CPU环境执行前需注释掉for gpu部分,并解开for cpu部分的注释 Demo 将测试图片放入test_images

Yang Chenguang 2.6k Dec 29, 2022
TableBank: A Benchmark Dataset for Table Detection and Recognition

TableBank TableBank is a new image-based table detection and recognition dataset built with novel weak supervision from Word and Latex documents on th

844 Jan 04, 2023
This is a project to detect gestures to zoom in or out, using the real-time distance between the index finger and the thumb. It's based on OpenCV and Mediapipe.

Pinch-zoom This is a python project based on real-time hand-gesture detection, to zoom in or out, using the distance between the index finger and the

Harshit Bhalla 6 Jul 11, 2022
ARU-Net - Deep Learning Chinese Word Segment

ARU-Net: A Neural Pixel Labeler for Layout Analysis of Historical Documents Contents Introduction Installation Demo Training Introduction This is the

128 Sep 12, 2022
A curated list of awesome synthetic data for text location and recognition

awesome-SynthText A curated list of awesome synthetic data for text location and recognition and OCR datasets. Text location SynthText SynthText_Chine

Tianzhong 283 Jan 05, 2023