Scene-Text-Understanding

Survey

[2015-PAMI] Text Detection and Recognition in Imagery: A Survey [paper]
[2014-Front.Comput.Sci] Scene Text Detection and Recognition: Recent Advances and Future Trends [paper]
[2020-arXiv] Text Recognition in the Wild: A Survey [paper]

Scene Text Detection

[2019-CVPR] Arbitrary Shape Scene Text Detection with Adaptive Text Region Representation [paper]
[2019-CVPR] A Multitask Network for Localization and Recognition of Text in Images (end-to-end) [paper]
[2019-CVPR] AFDM: Handwriting Recognition in Low-resource Scripts using Adversarial Learning (data augmentation) [paper] [code]
[2019-CVPR] CRAFT: Character Region Awareness for Text Detection [paper] [code]
[2019-CVPR] Data Extraction from Charts via Single Deep Neural Network(*) [paper]
[2019-CVPR] E2E-MLT - an Unconstrained End-to-End Method for Multi-Language Scene Text [paper]
[2019-arXiv] FACLSTM: ConvLSTM with Focused Attention for Scene Text Recognition [paper]
[2019-CVPR] Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes [paper]
[2019-CVPR] PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network [paper] [tensorflow] [pytorch]
[2019-CVPR] PMTD: Pyramid Mask Text Detector [paper] [code]
[2019-CVPR] Spatial Fusion GAN for Image Synthesis (word synthesis) [paper](https://arxiv.org/abs/1812.05840) [code]
[2019-CVPR] Scene Text Detection with Supervised Pyramid Context Network [paper][keras]
[2019-arXiv] TextField: Learning A Deep Direction Field for Irregular Scene Text Detection [paper] [code]
[2019-CVPR] Typography with Decor: Intelligent Text Style Transfer [paper] [code]
[2019-CVPR] TIoU: Tightness-aware Evaluation Protocol for Scene Text Detection (new evaluation tool) [paper] [code]
[2019-arXiv] MORAN: A Multi-Object Rectified Attention Network for Scene Text Recognition [paper] [code]
[2019-CVPR] Scene Text Magnifier [paper]
[2018-CVPR] Pixel-Anchor: A Fast Oriented Scene Text Detector with Combined Networks [paper]
[2018-ECCV] Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes [paper] [code]
[2018-AAAI] PixelLink: Detecting Scene Text via Instance Segmentation [paper] [code]
[2018-CVPR] RRPN: Arbitrary-Oriented Scene Text Detection via Rotation Proposals [paper] [code]
[2018-CVPR] Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation [Paper]
[2018-arXiv] PixelLink: Detecting Scene Text via Instance Segmentation [Paper]
[2018-AAAI] SEE: Towards Semi-Supervised End-to-End Scene Text Recognition [Paper]
[2018-arXiv] TextBoxes++: A Single-Shot Oriented Scene Text Detector [Paper]
[2017-arXiv] Attention-based Extraction of Structured Information from Street View Imagery [Paper]
[2017-ICCV] Single Shot Text Detector with Regional Attention [Paper]
[2017-ICCV] WordSup: Exploiting Word Annotations for Character based Text Detection [Paper]
[2017-arXiv] R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection [Paper]
[2017-CVPR] EAST: An Efficient and Accurate Scene Text Detector [Paper] [Code]
[2017-arXiv] Cascaded Segmentation-Detection Networks for Word-Level Text Spotting [Paper]
[2017-arXiv] Deep Direct Regression for Multi-Oriented Scene Text Detection [Paper]
[2017-CVPR] Detecting Oriented Text in Natural Images by Linking Segments [Paper]
[2017-CVPR] Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection [Paper]
[2017-arXiv] Arbitrary-Oriented Scene Text Detection via Rotation Proposals [Paper]
[2017-AAAI] TextBoxes: A Fast Text Detector with a Single Deep Neural Network [Paper] [Code]
[2016-arXiv] Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network [Paper]
[2016-arXiv] DeepText: A Unified Framework for Text Proposal Generation and Text Detection in Natural Images [Paper] [Data]
[2017-PR] TextProposals: a Text-specific Selective Search Algorithm for Word Spotting in the Wild [paper] [code]
[2016-arXiv] Scene Text Detection via Holistic, Multi-Channel Prediction [Paper]
[2016-CVPR] Canny Text Detector: Fast and Robust Scene Text Localization Algorithm [Paper]
[2016-CVPR] Synthetic Data for Text Localisation in Natural Images [Paper] [Data] [Code]
[2016-ECCV] Detecting Text in Natural Image with Connectionist Text Proposal Network [Paper] [Demo] [Code]
[2016-TIP] Text-Attentional Convolutional Neural Networks for Scene Text Detection [Paper]
[2016-IJDAR] TextCatcher: a method to detect curved and challenging text in natural scenes [Paper]
[2016-CVPR] Multi-oriented text detection with fully convolutional networks [Paper]
[2015-TPAMI] Real-time Lexicon-free Scene Text Localization and Recognition
[2015-CVPR] Symmetry-Based Text Line Detection in Natural Scenes
[2015-ICCV]FASText: Efficient unconstrained scene text detector [Paper] https://github.com/MichalBusta/FASText
[2015-D.PhilThesis] Deep Learning for Text Spotting [Paper]
[2015-ICDAR] Object Proposals for Text Extraction in the Wild [Paper] https://github.com/lluisgomez/TextProposals
[2014-ECCV] Deep Features for Text Spotting [Paper] https://bitbucket.org/jaderberg/eccv2014_textspotting http://gitxiv.com/posts/uB4y7QdD5XquEJ69c/deep-features-for-text-spotting
[2014-TPAMI] Word Spotting and Recognition with Embedded Attributes [Paper] http://www.cvc.uab.es/~almazan/index/projects/words-att/index.html https://github.com/almazan/watts
[2014-TPAMI] Robust Text Detection in Natural Scene Images
[2014-ECCV] Robust Scene Text Detection with Convolution Neural Network Induced MSER Trees [Paper]
[2013-ICCV] Photo OCR: Reading Text in Uncontrolled Conditions [Paper]
[2012-CVPR] Real-time scene text localization and recognition [Paper]
[2010-CVPR] Detecting Text in Natural Scenes with Stroke Width Transform [Paper]

Scene Text Recognition

[2019-CVPR] ESIR: End-to-end Scene Text Recognition via Iterative Image Rectification [paper] [code] [code]
[2019-CVPR] E2E-MLT: an Unconstrained End-to-End Method for Multi-Language Scene Text [paper]
[2018-CVPR] FOTS: Fast Oriented Text Spotting with a Unified Network [paper]
[2017-ICCV] WeText: Scene Text Detection under Weak Supervision [Paper]
[2017-ICCV] Single Shot Text Detector with Regional Attention [Paper] [Code]
[2017-ICCV] Self-organized Text Detection with Minimal Post-processing via Border Learning [Paper]
[2017-ICCV] Focusing Attention: Towards Accurate Text Recognition in Natural Images [Paper]
[2017-ICCV] Towards End-to-end Text Spotting with Convolutional Recurrent Neural Networks [Paper]
[2017-CVPR] Unambiguous Text Localization and Retrieval for Cluttered Scenes [Paper]
[2017-ICCV] WordSup: Exploiting Word Annotations for Character based Text Detection [Paper]
[2017-ICCV] Deep TextSpotter: An End-to-End Trainable Scene Text Localization and Recognition Framework [Paper] [Code]
[2017-arXiv] Cascaded Segmentation-Detection Networks for Word-Level Text Spotting [Paper]
[2017-AAAI] Detection and Recognition of Text Embedding in Online Images via Neural Context Models [Paper] [Code]
[2017-arXiv] Improving Text Proposal for Scene Images with Fully Convolutional Networks [Paper]
[2017-AAAI] TextBoxes: A Fast Text Detector with a Single Deep Neural Network [Paper] [Code] GitHub code
[2017-CVPR] Detecting Oriented Text in Natural Images by Linking Segments [Paper]
[2017-arXiv] Arbitrary-Oriented Scene Text Detection via Rotation Proposals [Paper]
[2017-CVPR] Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection [Paper]
[2016-arXiv] DeepText: A Unified Framework for Text Proposal Generation and Text Detection in Natural Images [Paper]
[2017-arXiv] Full-Page Text Recognition: Learning Where to Start and When to Stop https://arxiv.org/pdf/1704.08628.pdf
[2016-AAAI] Reading Scene Text in Deep Convolutional Sequences [Paper]
[2016-IJCV] Reading Text in the Wild with Convolutional Neural Networks [Paper] http://zeus.robots.ox.ac.uk/textsearch/#/search/ http://www.robots.ox.ac.uk/~vgg/research/text
[2016-CVPR] Recursive Recurrent Nets with Attention Modeling for OCR in the Wild [Paper]
[2016-CVPR] Robust Scene Text Recognition with Automatic Rectification [Paper]
[2016-NIPS] Generative Shape Models: Joint Text Recognition and Segmentation with Very Little Training Data [Paper]
[2015-CoRR] An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition [Paper] https://github.com/bgshih/crnn
[2015-ICDAR] Automatic Script Identification in the Wild [Paper]
[2015-ICLR] Deep structured output learning for unconstrained text recognition [Paper]
[2014-NIPS] Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition [Paper] http://www.robots.ox.ac.uk/~vgg/publications/2014/Jaderberg14c/ http://www.robots.ox.ac.uk/~vgg/research/text/model_release.tar.gz
[2014-TIP] A Unified Framework for Multi-Oriented Text Detection and Recognition
[2012-ICPR] End-to-End Text Recognition with Convolutional Neural Networks [Paper] http://cs.stanford.edu/people/twangcat/ICPR2012_code/SceneTextCNN_demo.tar http://ufldl.stanford.edu/housenumbers/

PhD Thesis

[2016-PhD Thesis] Context Modeling for Semantic Text Matching and Scene Text Detection [Paper]
[2015-PhD Thesis] Deep Learning for Text Spotting [Paper]
[2012-PhD Thesis] End-to-End Text Recognition with Convolutional Neural Networks [Paper]

Text Detection

[2018-arXiv] TextBoxes++: A Single-Shot Oriented Scene Text Detector [Paper]

Dataset

PowerPoint Text Detection and Recognition Dataset 2017

COCO-Text (Computer Vision Group, Cornell) 2016

63,686 images, 173,589 text instances, 3 fine-grained text attributes.
Task: text localization and recognition

COCO-Text API

Synthetic Data for Text Localisation in Natural Images (VGG) 2016

800 thousand images
8 million synthetic word instances
download

Synthetic Word Dataset (Oxford, VGG) 2014

9 million images covering 90k English words
Task: text recognition, segmentation
download

IIIT 5K-Words 2012

5000 images from scene text and born-digital sources (2k training and 3k testing images)
Each image is a cropped word image of scene text with case-insensitive labels
Task: text recognition
download
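
Because the IIIT 5K-Words labels are case-insensitive, one reasonable way to score a recognizer on such data is to compare predictions and labels after case folding. The sketch below is illustrative only: the function name `word_accuracy` and the sample strings are assumptions, not part of any dataset toolkit.

```python
def word_accuracy(predictions, ground_truth):
    """Fraction of word images whose predicted text equals the label, ignoring case."""
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground_truth must have the same length")
    if not ground_truth:
        return 0.0
    # casefold() makes the comparison case-insensitive.
    hits = sum(p.casefold() == g.casefold() for p, g in zip(predictions, ground_truth))
    return hits / len(ground_truth)


# Hypothetical recognizer outputs and labels, for illustration only.
preds = ["Hotel", "EXIT", "cafe", "stop"]
labels = ["HOTEL", "exit", "CAFE", "shop"]
print(word_accuracy(preds, labels))  # 3 of 4 match after case folding -> 0.75
```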

StanfordSynth (Stanford, AI Group) 2012

Small single-character images of 62 characters (0-9, a-z, A-Z)
Task: text recognition
download
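
The 62-character set (0-9, a-z, A-Z) used by single-character datasets maps naturally onto 62 class indices. A minimal sketch follows, assuming the classes are ordered digits, then lowercase, then uppercase; the actual label order in any given dataset release may differ, and the names `ALPHABET`, `encode`, and `decode` are hypothetical.

```python
import string

# Assumed class order: digits, lowercase, uppercase (62 classes in total).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def encode(text):
    """Map each character to its class index; raises KeyError outside the 62-character set."""
    return [CHAR_TO_INDEX[ch] for ch in text]


def decode(indices):
    """Map class indices back to a string."""
    return "".join(ALPHABET[i] for i in indices)


print(len(ALPHABET))            # 62
print(encode("Ab9"))            # [36, 11, 9]
print(decode(encode("Ab9")))    # Ab9
```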

MSRA Text Detection 500 Database (MSRA-TD500) 2012

500 natural images (resolutions vary from 1296x864 to 1920x1280)
Chinese, English, or a mixture of both
Task: text detection

Street View Text (SVT) 2010

350 high resolution images (average size 1260 × 860) (100 images for training and 250 images for testing)
Only word level bounding boxes are provided with case-insensitive labels
Task: text localization

KAIST Scene Text Database 2010

3000 images of indoor and outdoor scenes containing text
Korean, English (Number), and Mixed (Korean + English + Number)
Task: text localization, segmentation and recognition

Chars74k 2009

Over 74K images from natural images, as well as a set of synthetically generated characters
Small single-character images of 62 characters (0-9, a-z, A-Z)
Task: text recognition

ICDAR Benchmark Datasets

Dataset	Description	Competition Paper
ICDAR 2017	42618 training images and 9837 testing images	`paper`
ICDAR 2015	1000 training images and 500 testing images	`paper`
ICDAR 2013	229 training images and 233 testing images	`paper`
ICDAR 2011	229 training images and 255 testing images	`paper`
ICDAR 2005	1001 training images and 489 testing images	`paper`
ICDAR 2003	181 training images and 251 testing images (word level and character level)	`paper`

Blogs

Online Service

Name	Description
Online OCR	API, free
Free OCR	API, free
New OCR	API, free
ABBYY FineReader Online	No API, free

Open Resources Code

Chinese natural-scene text detection and recognition based on YOLOv3 and CRNN [code]
Ultra-lightweight Chinese OCR: supports vertical text recognition and ncnn inference; PSENet (8.5M) + CRNN (6.3M) + AngleNet (1.5M), total model size only 17M [code]
Tesseract: C++-based tools for document analysis and OCR [code]
Ocropy: Python-based tools for document analysis and OCR https://github.com/tmbdev/ocropy
CLSTM: a small implementation of LSTM networks, focused on OCR https://github.com/tmbdev/clstm
CRNN: Convolutional Recurrent Neural Network (Torch7) https://github.com/bgshih/crnn
Attention-OCR: visual attention based OCR https://github.com/da03/Attention-OCR
Umaru: an OCR system based on Torch using LSTM/GRU-RNN and CTC, drawing on the work of rnnlib and clstm https://github.com/edward-zhu/umaru
AKSHAYUBHAT/DeepVideoAnalytics (CTPN+CRNN) [code]
ankush-me/SynthText [code]
JarveeLee/SynthText_Chinese_version [code]

Handwriting Recognition

[2016-arXiv] Drawing and Recognizing Chinese Characters with Recurrent Neural Network https://arxiv.org/abs/1606.06539
Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition https://arxiv.org/abs/1610.02616
Stroke Sequence-Dependent Deep Convolutional Neural Network for Online Handwritten Chinese Character Recognition https://arxiv.org/abs/1610.04057
High Performance Offline Handwritten Chinese Character Recognition Using GoogLeNet and Directional Feature Maps http://arxiv.org/abs/1505.04925
DeepHCCR: Offline Handwritten Chinese Character Recognition based on GoogLeNet and AlexNet (With CaffeModel) https://github.com/chongyangtao/DeepHCCR
Scan, Attend and Read: End-to-End Handwritten Paragraph Recognition with MDLSTM Attention http://arxiv.org/abs/1604.03286
MLPaint: the Real-Time Handwritten Digit Recognizer http://blog.mldb.ai/blog/posts/2016/09/mlpaint/
caffe-ocr: OCR with caffe deep learning framework https://github.com/pannous/caffe-ocr

Licence Tag Recognition

Reading Car License Plates Using Deep Convolutional Neural Networks and LSTMs
Numberplate recognition with Tensorflow http://matthewearl.github.io/2016/05/06/cnn-anpr/
end-to-end-for-plate-recognition https://github.com/szad670401/end-to-end-for-chinese-plate-recognition
http://rnd.azoft.com/applying-ocr-technology-receipt-recognition/

Owner

Alan Tang
