Tracking the latest progress in Scene Text Detection and Recognition: Must-read papers well organized

Overview

SceneTextPapers

Information about this repository

This repo serves as a complement to our IJCV paper, "Scene text detection and recognition: The deep learning era".

Citing this work

If you find this paper helpful in understanding the recent history of scene text detection and recognition algorithms, or in designing new ones, you are highly encouraged (though not required) to cite our paper:

@article{long2020scene,
  title={Scene text detection and recognition: The deep learning era},
  author={Long, Shangbang and He, Xin and Yao, Cong},
  journal={International Journal of Computer Vision},
  pages={1--24},
  year={2020},
  publisher={Springer}
}

Papers

I. Other Survey Papers:

  1. Scene text detection and recognition: Recent advances and future trends. Zhu, Yingying and Yao, Cong and Bai, Xiang. Frontiers of Computer Science, 2016 [paper]
  2. Text detection, tracking and recognition in video: A comprehensive survey. Yin, Xu-Cheng and Zuo, Ze-Yu and Tian, Shu and Liu, Cheng-Lin. TIP, 2016 [paper]
  3. Text detection and recognition in imagery: A survey. Ye, Qixiang and Doermann, David. TPAMI, 2015 [paper]
  4. Text localization and recognition in images and video. Uchida, Seiichi. 2014 [paper]

II. Main: Scene Text Detection and Recognition

2.1 Detection

2.1.1 Pipeline Simplification
Anchor-based methods
  1. Single Shot Text Detector With Regional Attention. He, Pan and Huang, Weilin and He, Tong and Zhu, Qile and Qiao, Yu and Li, Xiaolin. ICCV, 2017 [paper] [code]
  2. TextBoxes: A Fast Text Detector with a Single Deep Neural Network. Liao, Minghui and Shi, Baoguang and Bai, Xiang and Wang, Xinggang and Liu, Wenyu. AAAI, 2017 [paper] [code]
  3. Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection. Liu, Yuliang and Jin, Lianwen. CVPR, 2017 [paper]
  4. Detecting Oriented Text in Natural Images by Linking Segments. Shi, Baoguang and Bai, Xiang and Belongie, Serge. CVPR, 2017 [paper] [code]
  5. EAST: An Efficient and Accurate Scene Text Detector. Zhou, Xinyu and Yao, Cong and Wen, He and Wang, Yuzhi and Zhou, Shuchang and He, Weiran and Liang, Jiajun. CVPR, 2017 [paper] [code]
Region proposal methods
  1. Detecting Curve Text in the Wild: New Dataset and New Solution. Liu, Yuliang and Jin, Lianwen and Zhang, Shuaitao and Zhang, Sheng. 2017 [paper] [code]
  2. R2CNN: rotational region CNN for orientation robust scene text detection. Jiang, Yingying and Zhu, Xiangyu and Wang, Xiaobing and Yang, Shuli and Li, Wei and Wang, Hua and Fu, Pei and Luo, Zhenbo. 2017 [paper]
  3. Arbitrary-Oriented Scene Text Detection via Rotation Proposals. Ma, Jianqi and Shao, Weiyuan and Ye, Hao and Wang, Li and Wang, Hong and Zheng, Yingbin and Xue, Xiangyang. T MULTIMEDIA, 2017 [paper] [code]
  4. Weakly supervised text attention network for generating text proposals in scene images. Rong, Li and MengYi, En and JianQiang, Li and HaiBin, Zhang. ICDAR, 2017 [paper]
  5. Rotation-Sensitive Regression for Oriented Scene Text Detection. Liao, Minghui and Zhu, Zhen and Shi, Baoguang and Xia, Gui-song and Bai, Xiang. CVPR, 2018 [paper] [code]
  6. Feature Enhancement Network: A Refined Scene Text Detector. Zhang, Sheng and Liu, Yuliang and Jin, Lianwen and Luo, Canjie. AAAI, 2018 [paper]
2.1.2 Different Prediction Units
Text instance level
  1. Detecting Curve Text in the Wild: New Dataset and New Solution. Liu, Yuliang and Jin, Lianwen and Zhang, Shuaitao and Zhang, Sheng. 2017 [paper] [code]
  2. TextBoxes: A Fast Text Detector with a Single Deep Neural Network. Liao, Minghui and Shi, Baoguang and Bai, Xiang and Wang, Xinggang and Liu, Wenyu. AAAI, 2017 [paper] [code]
  3. EAST: An Efficient and Accurate Scene Text Detector. Zhou, Xinyu and Yao, Cong and Wen, He and Wang, Yuzhi and Zhou, Shuchang and He, Weiran and Liang, Jiajun. CVPR, 2017 [paper] [code]
  4. R2CNN: rotational region CNN for orientation robust scene text detection. Jiang, Yingying and Zhu, Xiangyu and Wang, Xiaobing and Yang, Shuli and Li, Wei and Wang, Hua and Fu, Pei and Luo, Zhenbo. 2017 [paper]
  5. Arbitrary-Oriented Scene Text Detection via Rotation Proposals. Ma, Jianqi and Shao, Weiyuan and Ye, Hao and Wang, Li and Wang, Hong and Zheng, Yingbin and Xue, Xiangyang. T MULTIMEDIA, 2017 [paper] [code]
  6. Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection. Liu, Yuliang and Jin, Lianwen. CVPR, 2017 [paper]
  7. Deep Direct Regression for Multi-Oriented Scene Text Detection. He, Wenhao and Zhang, Xu-Yao and Yin, Fei and Liu, Cheng-Lin. ICCV, 2017 [paper]
  8. Fused Text Segmentation Networks for Multi-oriented Scene Text Detection. Dai, Yuchen and Huang, Zheng and Gao, Yuting and Chen, Kai. 2017 [paper]
  9. Feature Enhancement Network: A Refined Scene Text Detector. Zhang, Sheng and Liu, Yuliang and Jin, Lianwen and Luo, Canjie. AAAI, 2018 [paper]
  10. Rotation-Sensitive Regression for Oriented Scene Text Detection. Liao, Minghui and Zhu, Zhen and Shi, Baoguang and Xia, Gui-song and Bai, Xiang. CVPR, 2018 [paper] [code]
Bottom-up (Pixel)
  1. Scene text detection via holistic, multi-channel prediction. Yao, Cong and Bai, Xiang and Sang, Nong and Zhou, Xinyu and Zhou, Shuchang and Cao, Zhimin. 2016 [paper]
  2. Multi-oriented text detection with fully convolutional networks. Zhang, Zheng and Zhang, Chengquan and Shen, Wei and Yao, Cong and Liu, Wenyu and Bai, Xiang. CVPR, 2016 [paper] [code]
  3. Self-organized Text Detection with Minimal Post-processing via Border Learning. Wu, Yue and Natarajan, Prem. CVPR, 2017 [paper]
  4. Multi-scale FCN with Cascaded Instance Aware Segmentation for Arbitrary Oriented Word Spotting in the Wild. He, Dafang and Yang, Xiao and Liang, Chen and Zhou, Zihan and Ororbia, Alexander G and Kifer, Daniel and Giles, C Lee. CVPR, 2017 [paper]
  5. Single Shot Text Detector With Regional Attention. He, Pan and Huang, Weilin and He, Tong and Zhu, Qile and Qiao, Yu and Li, Xiaolin. ICCV, 2017 [paper] [code]
  6. PixelLink: Detecting Scene Text via Instance Segmentation. Deng, Dan and Liu, Haifeng and Li, Xuelong and Cai, Deng. AAAI, 2018 [paper] [code]
Bottom-up (Components)
  1. Detecting text in natural image with connectionist text proposal network. Tian, Zhi and Huang, Weilin and He, Tong and He, Pan and Qiao, Yu. ECCV, 2016 [paper] [code]
  2. Aggregating local context for accurate scene text detection. He, Dafang and Yang, Xiao and Huang, Wenyi and Zhou, Zihan and Kifer, Daniel and Giles, C Lee. ACCV, 2016 [paper]
  3. Detecting Oriented Text in Natural Images by Linking Segments. Shi, Baoguang and Bai, Xiang and Belongie, Serge. CVPR, 2017 [paper] [code]
  4. Scene Text Detection with Novel Superpixel Based Character Candidate Extraction. Wang, Cong and Yin, Fei and Liu, Cheng-Lin. 2017 [paper]
  5. Deep Residual Text Detection Network for Scene Text. Zhu, Xiangyu and Jiang, Yingying and Yang, Shuli and Wang, Xiaobing and Li, Wei and Fu, Pei and Wang, Hua and Luo, Zhenbo. ICDAR, 2017 [paper]
  6. Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation. Lyu, Pengyuan and Yao, Cong and Wu, Wenhao and Yan, Shuicheng and Bai, Xiang. CVPR, 2018 [paper]
2.1.3 Specific Targets
Long text
  1. Detecting Oriented Text in Natural Images by Linking Segments. Shi, Baoguang and Bai, Xiang and Belongie, Serge. CVPR, 2017 [paper] [code]
  2. R2CNN: rotational region CNN for orientation robust scene text detection. Jiang, Yingying and Zhu, Xiangyu and Wang, Xiaobing and Yang, Shuli and Li, Wei and Wang, Hua and Fu, Pei and Luo, Zhenbo. 2017 [paper]
  3. Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation. Lyu, Pengyuan and Yao, Cong and Wu, Wenhao and Yan, Shuicheng and Bai, Xiang. CVPR, 2018 [paper]
Multi-oriented text
  1. R2CNN: rotational region CNN for orientation robust scene text detection. Jiang, Yingying and Zhu, Xiangyu and Wang, Xiaobing and Yang, Shuli and Li, Wei and Wang, Hua and Fu, Pei and Luo, Zhenbo. 2017 [paper]
  2. TextBoxes: A Fast Text Detector with a Single Deep Neural Network. Liao, Minghui and Shi, Baoguang and Bai, Xiang and Wang, Xinggang and Liu, Wenyu. AAAI, 2017 [paper] [code]
  3. Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection. Liu, Yuliang and Jin, Lianwen. CVPR, 2017 [paper]
  4. Arbitrary-Oriented Scene Text Detection via Rotation Proposals. Ma, Jianqi and Shao, Weiyuan and Ye, Hao and Wang, Li and Wang, Hong and Zheng, Yingbin and Xue, Xiangyang. T MULTIMEDIA, 2017 [paper] [code]
  5. Detecting Oriented Text in Natural Images by Linking Segments. Shi, Baoguang and Bai, Xiang and Belongie, Serge. CVPR, 2017 [paper] [code]
  6. EAST: An Efficient and Accurate Scene Text Detector. Zhou, Xinyu and Yao, Cong and Wen, He and Wang, Yuzhi and Zhou, Shuchang and He, Weiran and Liang, Jiajun. CVPR, 2017 [paper] [code]
  7. Rotation-Sensitive Regression for Oriented Scene Text Detection. Liao, Minghui and Zhu, Zhen and Shi, Baoguang and Xia, Gui-song and Bai, Xiang. CVPR, 2018 [paper] [code]
  8. Geometry-Aware Scene Text Detection With Instance Transformation Network. Wang, Fangfang and Zhao, Liming and Li, Xi and Wang, Xinchao and Tao, Dacheng. CVPR, 2018 [paper] [code]
Irregular text
  1. Detecting Curve Text in the Wild: New Dataset and New Solution. Liu, Yuliang and Jin, Lianwen and Zhang, Shuaitao and Zhang, Sheng. 2017 [paper] [code]
  2. Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes. Lyu, Pengyuan and Liao, Minghui and Yao, Cong and Wu, Wenhao and Bai, Xiang. ECCV, 2018 [paper]
  3. TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes. Long, Shangbang and Ruan, Jiaqiang and Zhang, Wenjie and He, Xin and Wu, Wenhao and Yao, Cong. ECCV, 2018 [paper]
  4. Scene Text Detection with Supervised Pyramid Context Network. Enze Xie, Yuhang Zang, Shuai Shao, Gang Yu, Cong Yao, Guangyao Li. AAAI, 2019 [paper]
  5. Learning Shape-Aware Embedding for Scene Text Detection. Zhuotao Tian, Michelle Shu, Pengyuan Lyu, Ruiyu Li, Chao Zhou, Xiaoyong Shen, Jiaya Jia. CVPR, 2019 [paper]
  6. Arbitrary Shape Scene Text Detection With Adaptive Text Region Representation. Xiaobing Wang, Yingying Jiang, Zhenbo Luo, Cheng-Lin Liu, Hyunsoo Choi, Sungjin Kim. CVPR, 2019 [paper]
  7. Towards Robust Curve Text Detection With Conditional Spatial Expansion. Zichuan Liu, Guosheng Lin, Sheng Yang, Fayao Liu, Weisi Lin, Wang Ling Goh. CVPR, 2019 [paper]
  8. Shape Robust Text Detection With Progressive Scale Expansion Network. Xiang Li, Wenhai Wang, Wenbo Hou, Ruo-Ze Liu, Tong Lu, Jian Yang. CVPR, 2019 [paper]
  9. Character Region Awareness for Text Detection. Youngmin Baek, Bado Lee, Dongyoon Han, Sangdoo Yun, Hwalsuk Lee. CVPR, 2019 [paper]
  10. Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes. Chengquan Zhang, Borong Liang, Zuming Huang, Mengyi En, Junyu Han, Errui Ding, Xinghao Ding. CVPR, 2019 [paper]
  11. Efficient and Accurate Arbitrary-Shaped Text Detection With Pixel Aggregation Network. Wang, Wenhai and Xie, Enze and Song, Xiaoge and Zang, Yuhang and Wang, Wenjia and Lu, Tong and Yu, Gang and Shen, Chunhua. ICCV, 2019 [paper]
Speed up
  1. EAST: An Efficient and Accurate Scene Text Detector. Zhou, Xinyu and Yao, Cong and Wen, He and Wang, Yuzhi and Zhou, Shuchang and He, Weiran and Liang, Jiajun. CVPR, 2017 [paper] [code]
Easy instance segmentation
  1. Multi-scale FCN with Cascaded Instance Aware Segmentation for Arbitrary Oriented Word Spotting in the Wild. He, Dafang and Yang, Xiao and Liang, Chen and Zhou, Zihan and Ororbia, Alexander G and Kifer, Daniel and Giles, C Lee. CVPR, 2017 [paper]
  2. Self-organized Text Detection with Minimal Post-processing via Border Learning. Wu, Yue and Natarajan, Prem. CVPR, 2017 [paper]
  3. WordFence: Text Detection in Natural Images with Border Awareness. Polzounov, Andrei and Ablavatski, Artsiom and Escalera, Sergio and Lu, Shijian and Cai, Jianfei. ICIP, 2017 [paper]
  4. PixelLink: Detecting Scene Text via Instance Segmentation. Deng, Dan and Liu, Haifeng and Li, Xuelong and Cai, Deng. AAAI, 2018 [paper] [code]
Retrieving designated text
  1. Unambiguous text localization and retrieval for cluttered scenes. Rong, Xuejian and Yi, Chucai and Tian, Yingli. CVPR, 2017 [paper]
Against complex background
  1. Single Shot Text Detector With Regional Attention. He, Pan and Huang, Weilin and He, Tong and Zhu, Qile and Qiao, Yu and Li, Xiaolin. ICCV, 2017 [paper] [code]

2.2 Recognition

2.2.1 CTC based methods
  1. Unconstrained on-line handwriting recognition with recurrent neural networks. Graves, Alex and Liwicki, Marcus and Bunke, Horst and Schmidhuber, Jurgen and Fernandez, Santiago. NIPS, 2008 [paper]
  2. Accurate scene text recognition based on recurrent neural network. Su, Bolan and Lu, Shijian. ACCV, 2014 [paper]
  3. STAR-Net: A SpaTial Attention Residue Network for Scene Text Recognition. Liu, Wei and Chen, Chaofeng and Wong, Kwan-Yee K and Su, Zhizhong and Han, Junyu. BMVC, 2016 [paper]
  4. An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. Shi, Baoguang and Bai, Xiang and Yao, Cong. TPAMI, 2017 [paper] [code]
  5. Reading Scene Text with Attention Convolutional Sequence Modeling. Gao, Yunze and Chen, Yingying and Wang, Jinqiao and Lu, Hanqing. 2017 [paper]
  6. Scene Text Recognition with Sliding Convolutional Character Models. Yin, Fei and Wu, Yi-Chao and Zhang, Xu-Yao and Liu, Cheng-Lin. 2017 [paper]
2.2.2 Attention based methods
  1. Robust scene text recognition with automatic rectification. Shi, Baoguang and Wang, Xinggang and Lyu, Pengyuan and Yao, Cong and Bai, Xiang. CVPR, 2016 [paper]
  2. Recursive recurrent nets with attention modeling for ocr in the wild. Lee, Chen-Yu and Osindero, Simon. CVPR, 2016 [paper]
  3. Visual attention models for scene text recognition. Ghosh, Suman K and Valveny, Ernest and Bagdanov, Andrew D. ICDAR, 2017 [paper]
  4. Focusing Attention: Towards Accurate Text Recognition in Natural Images. Cheng, Zhanzhan and Bai, Fan and Xu, Yunlu and Zheng, Gang and Pu, Shiliang and Zhou, Shuigeng. ICCV, 2017 [paper]
  5. Learning to Read Irregular Text with Attention Mechanisms. Yang, Xiao and He, Dafang and Zhou, Zihan and Kifer, Daniel and Giles, C Lee. IJCAI, 2017 [paper]
  6. Arbitrarily-Oriented Text Recognition. Cheng, Zhanzhan and Liu, Xuyang and Bai, Fan and Niu, Yi and Pu, Shiliang and Zhou, Shuigeng. CVPR, 2018 [paper]
  7. Edit Probability for Scene Text Recognition. Bai, Fan and Cheng, Zhanzhan and Niu, Yi and Pu, Shiliang and Zhou, Shuigeng. CVPR, 2018 [paper]
  8. SqueezedText: A Real-time Scene Text Recognition by Binary Convolutional Encoder-decoder Network. Liu, Zichuan and Li, Yixing and Ren, Fengbo and Yu, Hao and Goh, Wangling. AAAI, 2018 [paper]
  9. Show, attend and read: a simple and strong baseline for recognising irregular text. Hui Li, Peng Wang, Chunhua Shen, Guyu Zhang. AAAI, 2019 [paper]
  10. Scene Text Recognition from Two-Dimensional Perspective. Minghui Liao, Jian Zhang, Zhaoyi Wan, Fengming Xie, Jiajun Liang, Pengyuan Lyu, Cong Yao, Xiang Bai. AAAI, 2019 [paper]
  11. ESIR: End-To-End Scene Text Recognition via Iterative Image Rectification. Fangneng Zhan, Shijian Lu. CVPR, 2019 [paper]
  12. What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis. Baek, Jeonghun and Kim, Geewook and Lee, Junyeop and Park, Sungrae and Han, Dongyoon and Yun, Sangdoo and Oh, Seong Joon and Lee, Hwalsuk. ICCV, 2019 [paper]
  13. Symmetry-Constrained Rectification Network for Scene Text Recognition. Yang, Mingkun and Guan, Yushuo and Liao, Minghui and He, Xin and Bian, Kaigui and Bai, Song and Yao, Cong and Bai, Xiang. ICCV, 2019 [paper]

2.3 End-to-End Text Spotting

2.3.1 Separately Trained Two-Stage Methods
  1. Reading text in the wild with convolutional neural networks. Jaderberg, Max and Simonyan, Karen and Vedaldi, Andrea and Zisserman, Andrew. IJCV, 2016 [paper]
  2. Synthetic data for text localisation in natural images. Gupta, Ankush and Vedaldi, Andrea and Zisserman, Andrew. CVPR, 2016 [paper] [code]
  3. TextBoxes: A Fast Text Detector with a Single Deep Neural Network. Liao, Minghui and Shi, Baoguang and Bai, Xiang and Wang, Xinggang and Liu, Wenyu. AAAI, 2017 [paper] [code]
2.3.2 Jointly Trained Two-Stage Methods
  1. SEE: Towards Semi-Supervised End-to-End Scene Text Recognition. Bartz, Christian and Yang, Haojin and Meinel, Christoph. 2017 [paper] [code]
  2. Deep TextSpotter: An End-To-End Trainable Scene Text Localization and Recognition Framework. Busta, Michal and Neumann, Lukas and Matas, Jiri. ICCV, 2017 [paper] [code]
  3. Towards End-To-End Text Spotting With Convolutional Recurrent Neural Networks. Li, Hui and Wang, Peng and Shen, Chunhua. ICCV, 2017 [paper]
  4. An End-to-End TextSpotter With Explicit Alignment and Attention. He, Tong and Tian, Zhi and Huang, Weilin and Shen, Chunhua and Qiao, Yu and Sun, Changming. CVPR, 2018 [paper]
  5. FOTS: Fast Oriented Text Spotting with a Unified Network. Liu, Xuebo and Liang, Ding and Yan, Shi and Chen, Dagui and Qiao, Yu and Yan, Junjie. CVPR, 2018 [paper]
  6. Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes. Lyu, Pengyuan and Liao, Minghui and Yao, Cong and Wu, Wenhao and Bai, Xiang. ECCV, 2018 [paper]
  7. Towards Unconstrained End-to-End Text Spotting. Qin, Siyang and Bissacco, Alessandro and Raptis, Michalis and Fujii, Yasuhisa and Xiao, Ying. ICCV, 2019 [paper]
  8. TextDragon: An End-to-End Framework for Arbitrary Shaped Text Spotting. Feng, Wei and He, Wenhao and Yin, Fei and Zhang, Xu-Yao and Liu, Cheng-Lin. ICCV, 2019 [paper]
  9. Convolutional character networks. Xing, Linjie and Tian, Zhi and Huang, Weilin and Scott, Matthew R. ICCV, 2019 [paper]

2.4 Auxiliary Techniques

2.4.1 Synthetic Data
  1. Synthetic data and artificial neural networks for natural scene text recognition. Jaderberg, Max and Simonyan, Karen and Vedaldi, Andrea and Zisserman, Andrew. NIPS, 2014 [paper]
  2. Synthetic data for text localisation in natural images. Gupta, Ankush and Vedaldi, Andrea and Zisserman, Andrew. CVPR, 2016 [paper] [code]
  3. Verisimilar Image Synthesis for Accurate Detection and Recognition of Texts in Scenes. Zhan, Fangneng and Lu, Shijian and Xue, Chuhui. ECCV, 2018 [paper] [code]
  4. UnrealText: Synthesizing Realistic Scene Text Images from the Unreal World. Long, Shangbang and Yao, Cong. CVPR, 2020 [paper] [code]
2.4.2 Weak/Semi-Supervision
  1. Wetext: Scene text detection under weak supervision. Tian, Shangxuan and Lu, Shijian and Li, Chongshou. ICCV, 2017 [paper]
  2. Weakly supervised text attention network for generating text proposals in scene images. Rong, Li and MengYi, En and JianQiang, Li and HaiBin, Zhang. ICDAR, 2017 [paper]
  3. WordSup: Exploiting word annotations for character based text detection. Hu, Han and Zhang, Chengquan and Luo, Yuxuan and Wang, Yuzhuo and Han, Junyu and Ding, Errui. ICCV, 2017 [paper]
  4. Chinese Street View Text: Large-Scale Chinese Text Reading With Partially Supervised Learning. Sun, Yipeng and Liu, Jiaming and Liu, Wei and Han, Junyu and Ding, Errui and Liu, Jingtuo. ICCV, 2019 [paper]
2.4.3 Deblurring
  1. Convolutional neural networks for direct text deblurring. Hradis, Michal and Kotera, Jan and Zemcik, Pavel and Sroubek, Filip. BMVC, 2015 [paper] [code]
  2. A blind deconvolution model for scene text detection and recognition in video. Khare, Vijeta and Shivakumara, Palaiahnakote and Raveendran, Paramesran and Blumenstein, Michael. PR, 2016 [paper]
2.4.4 Context Information
  1. Could scene context be beneficial for scene text detection? Zhu, Anna and Gao, Renwu and Uchida, Seiichi. PR, 2016 [paper]
2.4.5 Adversarial Attack
  1. Adaptive Adversarial Attack on Scene Text Recognition. Yuan, Xiaoyong and He, Pan and Li, Xiaolin Andy. 2018 [paper]
2.4.6 Evaluation
  1. Tightness-Aware Evaluation Protocol for Scene Text Detection. Yuliang Liu, Lianwen Jin, Zecheng Xie, Canjie Luo, Shuaitao Zhang, Lele Xie. CVPR, 2019 [paper]

III. Datasets

| Dataset (Year) | Image Num (train/test) | Text Num (train/test) | Orientation | Language | Characteristics | Detection/Recognition Task |
| --- | --- | --- | --- | --- | --- | --- |
| End2End | | | | | | |
| ICDAR03 (2003) | 509 (258/251) | 2276 (1110/1156) | Horizontal | En | - | ✓/✓ |
| ICDAR13 Scene Text (2013) | 462 (229/233) | - (848/1095) | Horizontal | En | - | ✓/✓ |
| ICDAR15 Incidental Text (2015) | 1500 (1000/500) | - (-/-) | Multi-Oriented | En | Blur, Small, Defocused | ✓/✓ |
| ICDAR17 / RCTW (2017) | 12263 (8034/4229) | - (-/-) | Multi-Oriented | Cn | - | ✓/✓ |
| Total-Text (2017) | 1555 (1255/300) | - (-/-) | Multi-Oriented, Curved | En, Cn | Irregular polygon label | ✓/✓ |
| SVT (2010) | 350 (100/250) | 904 (257/647) | Horizontal | En | - | ✓/✓ |
| KAIST (2010) | 3000 (-/-) | 5000 (-/-) | Horizontal | En, Ko | Distorted | ✓/✓ |
| NEOCR (2011) | 659 (-/-) | 5238 (-/-) | Multi-Oriented | 8 langs | - | ✓/✓ |
| CUTE (2014) | 80 (-/80) | - (-/-) | Curved | En | - | ✓/✓ |
| CTW (2017) | 32K (25K/6K) | 1M (812K/205K) | Multi-Oriented | Cn | Fine-grained annotation | ✓/✓ |
| CASIA-10K (2018) | 10K (7K/3K) | - (-/-) | Multi-Oriented | Cn | | ✓/✓ |
| Detection Only | | | | | | |
| OSTD (2011) | 89 (-/-) | 218 (-/-) | Multi-Oriented | En | - | ✓/- |
| MSRA-TD500 (2012) | 500 (300/200) | 1719 (1068/651) | Multi-Oriented | En, Cn | Long text | ✓/- |
| HUST-TR400 (2014) | 400 (400/-) | - (-/-) | Multi-Oriented | En, Cn | Long text | ✓/- |
| ICDAR17 / RRC-MLT (2017) | 18000 (9000/9000) | - (-/-) | Multi-Oriented | 9 langs | - | ✓/- |
| CTW1500 (2017) | 1500 (1000/500) | - (-/-) | Multi-Oriented, Curved | En | Bounding box with 14 vertices | ✓/- |
| Recognition Only | | | | | | |
| Char74k (2009) | 74107 (-/-) | 74107 (-/-) | Horizontal | En, Kannada | Character label | -/✓ |
| IIIT 5K-Word (2012) | 5000 (-/-) | 5000 (2000/3000) | Horizontal | - | Cropped | -/✓ |
| SVHN (2010) | - (-/-) | 600000 (-/-) | Horizontal | - | House number digits | -/✓ |
| SVTP (2013) | 639 (-/639) | - (-/-) | | En | Distorted | -/✓ |

Owner
Shangbang Long