Tracking the latest progress in Scene Text Detection and Recognition: Must-read papers well organized

Overview

SceneTextPapers

Information about this repository

This repo serves as a complement to our IJCV paper, "Scene Text Detection and Recognition: The Deep Learning Era" (see the citation below).

Citing this work

If you find this paper helpful in understanding the recent history of scene text detection and recognition algorithms, or in designing new ones, you are encouraged (though not required) to cite our paper:

@article{long2020scene,
  title={Scene text detection and recognition: The deep learning era},
  author={Long, Shangbang and He, Xin and Yao, Cong},
  journal={International Journal of Computer Vision},
  pages={1--24},
  year={2020},
  publisher={Springer}
}

Papers

I. Other Survey Papers:

  1. Scene text detection and recognition: Recent advances and future trends. Zhu, Yingying and Yao, Cong and Bai, Xiang. Frontiers of Computer Science, 2016 [paper]
  2. Text detection, tracking and recognition in video: A comprehensive survey. Yin, Xu-Cheng and Zuo, Ze-Yu and Tian, Shu and Liu, Cheng-Lin. TIP, 2016 [paper]
  3. Text detection and recognition in imagery: A survey. Ye, Qixiang and Doermann, David. TPAMI, 2015 [paper]
  4. Text localization and recognition in images and video. Uchida, Seiichi. 2014 [paper]

II. Main: Scene Text Detection and Recognition

2.1 Detection

2.1.1 Pipeline Simplification
Anchor-based methods
  1. Single Shot Text Detector With Regional Attention. He, Pan and Huang, Weilin and He, Tong and Zhu, Qile and Qiao, Yu and Li, Xiaolin. ICCV, 2017 [paper] [code]
  2. TextBoxes: A Fast Text Detector with a Single Deep Neural Network. Liao, Minghui and Shi, Baoguang and Bai, Xiang and Wang, Xinggang and Liu, Wenyu. AAAI, 2017 [paper] [code]
  3. Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection. Liu, Yuliang and Jin, Lianwen. CVPR, 2017 [paper]
  4. Detecting Oriented Text in Natural Images by Linking Segments. Shi, Baoguang and Bai, Xiang and Belongie, Serge. CVPR, 2017 [paper] [code]
  5. EAST: An Efficient and Accurate Scene Text Detector. Zhou, Xinyu and Yao, Cong and Wen, He and Wang, Yuzhi and Zhou, Shuchang and He, Weiran and Liang, Jiajun. CVPR, 2017 [paper] [code]
Region proposal methods
  1. Detecting Curve Text in the Wild: New Dataset and New Solution. Yuliang, Liu and Lianwen, Jin and Shuaitao, Zhang and Sheng, Zhang. 2017 [paper] [code]
  2. R2CNN: rotational region CNN for orientation robust scene text detection. Jiang, Yingying and Zhu, Xiangyu and Wang, Xiaobing and Yang, Shuli and Li, Wei and Wang, Hua and Fu, Pei and Luo, Zhenbo. 2017 [paper]
  3. Arbitrary-Oriented Scene Text Detection via Rotation Proposals. Ma, Jianqi and Shao, Weiyuan and Ye, Hao and Wang, Li and Wang, Hong and Zheng, Yingbin and Xue, Xiangyang. TMM, 2017 [paper] [code]
  4. Weakly supervised text attention network for generating text proposals in scene images. Rong, Li and MengYi, En and JianQiang, Li and HaiBin, Zhang. ICDAR, 2017 [paper]
  5. Rotation-Sensitive Regression for Oriented Scene Text Detection. Liao, Minghui and Zhu, Zhen and Shi, Baoguang and Xia, Gui-song and Bai, Xiang. CVPR, 2018 [paper] [code]
  6. Feature Enhancement Network: A Refined Scene Text Detector. Sheng, Zhang and Yuliang, Liu and Lianwen, Jin and Canjie, Luo. AAAI, 2017 [paper]
2.1.2 Different Prediction Units
Text instance level
  1. Detecting Curve Text in the Wild: New Dataset and New Solution. Yuliang, Liu and Lianwen, Jin and Shuaitao, Zhang and Sheng, Zhang. 2017 [paper] [code]
  2. TextBoxes: A Fast Text Detector with a Single Deep Neural Network. Liao, Minghui and Shi, Baoguang and Bai, Xiang and Wang, Xinggang and Liu, Wenyu. AAAI, 2017 [paper] [code]
  3. EAST: An Efficient and Accurate Scene Text Detector. Zhou, Xinyu and Yao, Cong and Wen, He and Wang, Yuzhi and Zhou, Shuchang and He, Weiran and Liang, Jiajun. CVPR, 2017 [paper] [code]
  4. R2CNN: rotational region CNN for orientation robust scene text detection. Jiang, Yingying and Zhu, Xiangyu and Wang, Xiaobing and Yang, Shuli and Li, Wei and Wang, Hua and Fu, Pei and Luo, Zhenbo. 2017 [paper]
  5. Arbitrary-Oriented Scene Text Detection via Rotation Proposals. Ma, Jianqi and Shao, Weiyuan and Ye, Hao and Wang, Li and Wang, Hong and Zheng, Yingbin and Xue, Xiangyang. TMM, 2017 [paper] [code]
  6. Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection. Liu, Yuliang and Jin, Lianwen. CVPR, 2017 [paper]
  7. Deep Direct Regression for Multi-Oriented Scene Text Detection. He, Wenhao and Zhang, Xu-Yao and Yin, Fei and Liu, Cheng-Lin. ICCV, 2017 [paper]
  8. Fused Text Segmentation Networks for Multi-oriented Scene Text Detection. Dai, Yuchen and Huang, Zheng and Gao, Yuting and Chen, Kai. 2017 [paper]
  9. Feature Enhancement Network: A Refined Scene Text Detector. Sheng, Zhang and Yuliang, Liu and Lianwen, Jin and Canjie, Luo. AAAI, 2017 [paper]
  10. Rotation-Sensitive Regression for Oriented Scene Text Detection. Liao, Minghui and Zhu, Zhen and Shi, Baoguang and Xia, Gui-song and Bai, Xiang. CVPR, 2018 [paper] [code]
Bottom-up (Pixel)
  1. Scene text detection via holistic, multi-channel prediction. Yao, Cong and Bai, Xiang and Sang, Nong and Zhou, Xinyu and Zhou, Shuchang and Cao, Zhimin. 2016 [paper]
  2. Multi-oriented text detection with fully convolutional networks. Zhang, Zheng and Zhang, Chengquan and Shen, Wei and Yao, Cong and Liu, Wenyu and Bai, Xiang. CVPR, 2016 [paper] [code]
  3. Self-organized Text Detection with Minimal Post-processing via Border Learning. Wu, Yue and Natarajan, Prem. CVPR, 2017 [paper]
  4. Multi-scale FCN with Cascaded Instance Aware Segmentation for Arbitrary Oriented Word Spotting in the Wild. He, Dafang and Yang, Xiao and Liang, Chen and Zhou, Zihan and Ororbia, Alexander G and Kifer, Daniel and Giles, C Lee. CVPR, 2017 [paper]
  5. Single Shot Text Detector With Regional Attention. He, Pan and Huang, Weilin and He, Tong and Zhu, Qile and Qiao, Yu and Li, Xiaolin. ICCV, 2017 [paper] [code]
  6. PixelLink: Detecting Scene Text via Instance Segmentation. Dan, Deng and Haifeng, Liu and Xuelong, Li and Deng, Cai. AAAI, 2018 [paper] [code]
Bottom-up (Components)
  1. Detecting text in natural image with connectionist text proposal network. Tian, Zhi and Huang, Weilin and He, Tong and He, Pan and Qiao, Yu. ECCV, 2016 [paper] [code]
  2. Aggregating local context for accurate scene text detection. He, Dafang and Yang, Xiao and Huang, Wenyi and Zhou, Zihan and Kifer, Daniel and Giles, C Lee. ACCV, 2016 [paper]
  3. Detecting Oriented Text in Natural Images by Linking Segments. Shi, Baoguang and Bai, Xiang and Belongie, Serge. CVPR, 2017 [paper] [code]
  4. Scene Text Detection with Novel Superpixel Based Character Candidate Extraction. Wang, Cong and Yin, Fei and Liu, Cheng-Lin. 2017 [paper]
  5. Deep Residual Text Detection Network for Scene Text. Zhu, Xiangyu and Jiang, Yingying and Yang, Shuli and Wang, Xiaobing and Li, Wei and Fu, Pei and Wang, Hua and Luo, Zhenbo. ICDAR, 2017 [paper]
  6. Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation. Lyu, Pengyuan and Yao, Cong and Wu, Wenhao and Yan, Shuicheng and Bai, Xiang. CVPR, 2018 [paper]
2.1.3 Specific Targets
Long text
  1. Detecting Oriented Text in Natural Images by Linking Segments. Shi, Baoguang and Bai, Xiang and Belongie, Serge. CVPR, 2017 [paper] [code]
  2. R2CNN: rotational region CNN for orientation robust scene text detection. Jiang, Yingying and Zhu, Xiangyu and Wang, Xiaobing and Yang, Shuli and Li, Wei and Wang, Hua and Fu, Pei and Luo, Zhenbo. 2017 [paper]
  3. Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation. Lyu, Pengyuan and Yao, Cong and Wu, Wenhao and Yan, Shuicheng and Bai, Xiang. CVPR, 2018 [paper]
Multi-oriented text
  1. R2CNN: rotational region CNN for orientation robust scene text detection. Jiang, Yingying and Zhu, Xiangyu and Wang, Xiaobing and Yang, Shuli and Li, Wei and Wang, Hua and Fu, Pei and Luo, Zhenbo. 2017 [paper]
  2. TextBoxes: A Fast Text Detector with a Single Deep Neural Network. Liao, Minghui and Shi, Baoguang and Bai, Xiang and Wang, Xinggang and Liu, Wenyu. AAAI, 2017 [paper] [code]
  3. Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection. Liu, Yuliang and Jin, Lianwen. CVPR, 2017 [paper]
  4. Arbitrary-Oriented Scene Text Detection via Rotation Proposals. Ma, Jianqi and Shao, Weiyuan and Ye, Hao and Wang, Li and Wang, Hong and Zheng, Yingbin and Xue, Xiangyang. TMM, 2017 [paper] [code]
  5. Detecting Oriented Text in Natural Images by Linking Segments. Shi, Baoguang and Bai, Xiang and Belongie, Serge. CVPR, 2017 [paper] [code]
  6. EAST: An Efficient and Accurate Scene Text Detector. Zhou, Xinyu and Yao, Cong and Wen, He and Wang, Yuzhi and Zhou, Shuchang and He, Weiran and Liang, Jiajun. CVPR, 2017 [paper] [code]
  7. Rotation-Sensitive Regression for Oriented Scene Text Detection. Liao, Minghui and Zhu, Zhen and Shi, Baoguang and Xia, Gui-song and Bai, Xiang. CVPR, 2018 [paper] [code]
  8. Geometry-Aware Scene Text Detection With Instance Transformation Network. Wang, Fangfang and Zhao, Liming and Li, Xi and Wang, Xinchao and Tao, Dacheng. CVPR, 2018 [paper] [code]
Irregular text
  1. Detecting Curve Text in the Wild: New Dataset and New Solution. Yuliang, Liu and Lianwen, Jin and Shuaitao, Zhang and Sheng, Zhang. 2017 [paper] [code]
  2. Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes. Lyu, Pengyuan and Liao, Minghui and Yao, Cong and Wu, Wenhao and Bai, Xiang. ECCV, 2018 [paper]
  3. TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes. Long, Shangbang and Ruan, Jiaqiang and Zhang, Wenjie and He, Xin and Wu, Wenhao and Yao, Cong. ECCV, 2018 [paper]
  4. Scene Text Detection with Supervised Pyramid Context Network. Enze Xie, Yuhang Zang, Shuai Shao, Gang Yu, Cong Yao, Guangyao Li. AAAI, 2019 [paper]
  5. Learning Shape-Aware Embedding for Scene Text Detection. Zhuotao Tian, Michelle Shu, Pengyuan Lyu, Ruiyu Li, Chao Zhou, Xiaoyong Shen, Jiaya Jia. CVPR, 2019 [paper]
  6. Arbitrary Shape Scene Text Detection With Adaptive Text Region Representation. Xiaobing Wang, Yingying Jiang, Zhenbo Luo, Cheng-Lin Liu, Hyunsoo Choi, Sungjin Kim. CVPR, 2019 [paper]
  7. Towards Robust Curve Text Detection With Conditional Spatial Expansion. Zichuan Liu, Guosheng Lin, Sheng Yang, Fayao Liu, Weisi Lin, Wang Ling Goh. CVPR, 2019 [paper]
  8. Shape Robust Text Detection With Progressive Scale Expansion Network. Xiang Li, Wenhai Wang, Wenbo Hou, Ruo-Ze Liu, Tong Lu, Jian Yang. CVPR, 2019 [paper]
  9. Character Region Awareness for Text Detection. Youngmin Baek, Bado Lee, Dongyoon Han, Sangdoo Yun, Hwalsuk Lee. CVPR, 2019 [paper]
  10. Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes. Chengquan Zhang, Borong Liang, Zuming Huang, Mengyi En, Junyu Han, Errui Ding, Xinghao Ding. CVPR, 2019 [paper]
  11. Efficient and Accurate Arbitrary-Shaped Text Detection With Pixel Aggregation Network. Wang, Wenhai and Xie, Enze and Song, Xiaoge and Zang, Yuhang and Wang, Wenjia and Lu, Tong and Yu, Gang and Shen, Chunhua. ICCV, 2019 [paper]
Speed up
  1. EAST: An Efficient and Accurate Scene Text Detector. Zhou, Xinyu and Yao, Cong and Wen, He and Wang, Yuzhi and Zhou, Shuchang and He, Weiran and Liang, Jiajun. CVPR, 2017 [paper] [code]
Easy instance segmentation
  1. Multi-scale FCN with Cascaded Instance Aware Segmentation for Arbitrary Oriented Word Spotting in the Wild. He, Dafang and Yang, Xiao and Liang, Chen and Zhou, Zihan and Ororbia, Alexander G and Kifer, Daniel and Giles, C Lee. CVPR, 2017 [paper]
  2. Self-organized Text Detection with Minimal Post-processing via Border Learning. Wu, Yue and Natarajan, Prem. CVPR, 2017 [paper]
  3. WordFence: Text Detection in Natural Images with Border Awareness. Polzounov, Andrei and Ablavatski, Artsiom and Escalera, Sergio and Lu, Shijian and Cai, Jianfei. ICIP, 2017 [paper]
  4. PixelLink: Detecting Scene Text via Instance Segmentation. Dan, Deng and Haifeng, Liu and Xuelong, Li and Deng, Cai. AAAI, 2018 [paper] [code]
Retrieving designated text
  1. Unambiguous text localization and retrieval for cluttered scenes. Rong, Xuejian and Yi, Chucai and Tian, Yingli. CVPR, 2017 [paper]
Against complex background
  1. Single Shot Text Detector With Regional Attention. He, Pan and Huang, Weilin and He, Tong and Zhu, Qile and Qiao, Yu and Li, Xiaolin. ICCV, 2017 [paper] [code]

2.2 Recognition

2.2.1 CTC based methods
  1. Unconstrained on-line handwriting recognition with recurrent neural networks. Graves, Alex and Liwicki, Marcus and Bunke, Horst and Schmidhuber, Jurgen and Fernandez, Santiago. NIPS, 2008 [paper]
  2. Accurate scene text recognition based on recurrent neural network. Su, Bolan and Lu, Shijian. ACCV, 2014 [paper]
  3. STAR-Net: A SpaTial Attention Residue Network for Scene Text Recognition. Liu, Wei and Chen, Chaofeng and Wong, Kwan-Yee K and Su, Zhizhong and Han, Junyu. BMVC, 2016 [paper]
  4. An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. Shi, Baoguang and Bai, Xiang and Yao, Cong. TPAMI, 2017 [paper] [code]
  5. Reading Scene Text with Attention Convolutional Sequence Modeling. Gao, Yunze and Chen, Yingying and Wang, Jinqiao and Lu, Hanqing. 2017 [paper]
  6. Scene Text Recognition with Sliding Convolutional Character Models. Yin, Fei and Wu, Yi-Chao and Zhang, Xu-Yao and Liu, Cheng-Lin. 2017 [paper]
2.2.2 Attention based methods
  1. Robust scene text recognition with automatic rectification. Shi, Baoguang and Wang, Xinggang and Lyu, Pengyuan and Yao, Cong and Bai, Xiang. CVPR, 2016 [paper]
  2. Recursive recurrent nets with attention modeling for ocr in the wild. Lee, Chen-Yu and Osindero, Simon. CVPR, 2016 [paper]
  3. Visual attention models for scene text recognition. Ghosh, Suman K and Valveny, Ernest and Bagdanov, Andrew D. ICDAR, 2017 [paper]
  4. Focusing Attention: Towards Accurate Text Recognition in Natural Images. Cheng, Zhanzhan and Bai, Fan and Xu, Yunlu and Zheng, Gang and Pu, Shiliang and Zhou, Shuigeng. ICCV, 2017 [paper]
  5. Learning to Read Irregular Text with Attention Mechanisms. Yang, Xiao and He, Dafang and Zhou, Zihan and Kifer, Daniel and Giles, C Lee. IJCAI, 2017 [paper]
  6. Arbitrarily-Oriented Text Recognition. Cheng, Zhanzhan and Liu, Xuyang and Bai, Fan and Niu, Yi and Pu, Shiliang and Zhou, Shuigeng. CVPR, 2017 [paper]
  7. Edit Probability for Scene Text Recognition. Bai, Fan and Cheng, Zhanzhan and Niu, Yi and Pu, Shiliang and Zhou, Shuigeng. CVPR, 2018 [paper]
  8. SqueezedText: A Real-time Scene Text Recognition by Binary Convolutional Encoder-decoder Network. Liu, Zichuan and Li, Yixing and Ren, Fengbo and Yu, Hao and Goh, Wangling. AAAI, 2018 [paper]
  9. Show, attend and read: a simple and strong baseline for recognising irregular text. Hui Li, Peng Wang, Chunhua Shen, Guyu Zhang. AAAI, 2019 [paper]
  10. Scene Text Recognition from Two-Dimensional Perspective. Minghui Liao, Jian Zhang, Zhaoyi Wan, Fengming Xie, Jiajun Liang, Pengyuan Lyu, Cong Yao, Xiang Bai. AAAI, 2019 [paper]
  11. ESIR: End-To-End Scene Text Recognition via Iterative Image Rectification. Fangneng Zhan, Shijian Lu. CVPR, 2019 [paper]
  12. What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis. Baek, Jeonghun and Kim, Geewook and Lee, Junyeop and Park, Sungrae and Han, Dongyoon and Yun, Sangdoo and Oh, Seong Joon and Lee, Hwalsuk. ICCV, 2019 [paper]
  13. Symmetry-Constrained Rectification Network for Scene Text Recognition. Yang, Mingkun and Guan, Yushuo and Liao, Minghui and He, Xin and Bian, Kaigui and Bai, Song and Yao, Cong and Bai, Xiang. ICCV, 2019 [paper]

2.3 End-to-End Text Spotting

2.3.1 Separately Trained Two-Stage Methods
  1. Reading text in the wild with convolutional neural networks. Jaderberg, Max and Simonyan, Karen and Vedaldi, Andrea and Zisserman, Andrew. IJCV, 2016 [paper]
  2. Synthetic data for text localisation in natural images. Gupta, Ankush and Vedaldi, Andrea and Zisserman, Andrew. CVPR, 2016 [paper] [code]
  3. TextBoxes: A Fast Text Detector with a Single Deep Neural Network. Liao, Minghui and Shi, Baoguang and Bai, Xiang and Wang, Xinggang and Liu, Wenyu. AAAI, 2017 [paper] [code]
2.3.2 Jointly Trained Two-Stage Methods
  1. SEE: Towards Semi-Supervised End-to-End Scene Text Recognition. Bartz, Christian and Yang, Haojin and Meinel, Christoph. 2017 [paper] [code]
  2. Deep TextSpotter: An End-To-End Trainable Scene Text Localization and Recognition Framework. Busta, Michal and Neumann, Lukas and Matas, Jiri. ICCV, 2017 [paper] [code]
  3. Towards End-To-End Text Spotting With Convolutional Recurrent Neural Networks. Li, Hui and Wang, Peng and Shen, Chunhua. ICCV, 2017 [paper]
  4. An End-to-End TextSpotter With Explicit Alignment and Attention. He, Tong and Tian, Zhi and Huang, Weilin and Shen, Chunhua and Qiao, Yu and Sun, Changming. CVPR, 2018 [paper]
  5. FOTS: Fast Oriented Text Spotting with a Unified Network. Liu, Xuebo and Liang, Ding and Yan, Shi and Chen, Dagui and Qiao, Yu and Yan, Junjie. CVPR, 2018 [paper]
  6. Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes. Lyu, Pengyuan and Liao, Minghui and Yao, Cong and Wu, Wenhao and Bai, Xiang. ECCV, 2018 [paper]
  7. Towards Unconstrained End-to-End Text Spotting. Qin, Siyang and Bissacco, Alessandro and Raptis, Michalis and Fujii, Yasuhisa and Xiao, Ying. ICCV, 2019 [paper]
  8. TextDragon: An End-to-End Framework for Arbitrary Shaped Text Spotting. Feng, Wei and He, Wenhao and Yin, Fei and Zhang, Xu-Yao and Liu, Cheng-Lin. ICCV, 2019 [paper]
  9. Convolutional character networks. Xing, Linjie and Tian, Zhi and Huang, Weilin and Scott, Matthew R. ICCV, 2019 [paper]

2.4 Auxiliary Techniques

2.4.1 Synthetic Data
  1. Synthetic data and artificial neural networks for natural scene text recognition. Jaderberg, Max and Simonyan, Karen and Vedaldi, Andrea and Zisserman, Andrew. NIPS, 2014 [paper]
  2. Synthetic data for text localisation in natural images. Gupta, Ankush and Vedaldi, Andrea and Zisserman, Andrew. CVPR, 2016 [paper] [code]
  3. Verisimilar Image Synthesis for Accurate Detection and Recognition of Texts in Scenes. Zhan, Fangneng and Lu, Shijian and Xue, Chuhui. ECCV, 2018 [paper] [code]
  4. UnrealText: Synthesizing Realistic Scene Text Images from the Unreal World. Long, Shangbang and Yao, Cong. CVPR, 2020 [paper] [code]
2.4.2 Weak/Semi-Supervision
  1. WeText: Scene text detection under weak supervision. Tian, Shangxuan and Lu, Shijian and Li, Chongshou. ICCV, 2017 [paper]
  2. Weakly supervised text attention network for generating text proposals in scene images. Rong, Li and MengYi, En and JianQiang, Li and HaiBin, Zhang. ICDAR, 2017 [paper]
  3. WordSup: Exploiting word annotations for character based text detection. Hu, Han and Zhang, Chengquan and Luo, Yuxuan and Wang, Yuzhuo and Han, Junyu and Ding, Errui. ICCV, 2017 [paper]
  4. Chinese Street View Text: Large-Scale Chinese Text Reading With Partially Supervised Learning. Sun, Yipeng and Liu, Jiaming and Liu, Wei and Han, Junyu and Ding, Errui and Liu, Jingtuo. ICCV, 2019 [paper]
2.4.3 Deblurring
  1. Convolutional neural networks for direct text deblurring. Hradis, Michal and Kotera, Jan and Zemcik, Pavel and Sroubek, Filip. BMVC, 2015 [paper] [code]
  2. A blind deconvolution model for scene text detection and recognition in video. Khare, Vijeta and Shivakumara, Palaiahnakote and Raveendran, Paramesran and Blumenstein, Michael. PR, 2016 [paper]
2.4.4 Context Information
  1. Could scene context be beneficial for scene text detection? Zhu, Anna and Gao, Renwu and Uchida, Seiichi. PR, 2016 [paper]
2.4.5 Adversarial Attack
  1. Adaptive Adversarial Attack on Scene Text Recognition. Yuan, Xiaoyong and He, Pan and Li, Xiaolin Andy. 2018 [paper]
2.4.6 Evaluation
  1. Tightness-Aware Evaluation Protocol for Scene Text Detection. Yuliang Liu, Lianwen Jin, Zecheng Xie, Canjie Luo, Shuaitao Zhang, Lele Xie. CVPR, 2019 [paper]

III. Datasets

| Dataset (Year) | Images (train/test) | Text instances (train/test) | Orientation | Language | Characteristics | Detection/Recognition task |
|---|---|---|---|---|---|---|
| End2End | | | | | | |
| ICDAR03 (2003) | 509 (258/251) | 2276 (1110/1156) | Horizontal | En | - | ✓/✓ |
| ICDAR13 Scene Text (2013) | 462 (229/233) | - (848/1095) | Horizontal | En | - | ✓/✓ |
| ICDAR15 Incidental Text (2015) | 1500 (1000/500) | - (-/-) | Multi-Oriented | En | Blur, Small, Defocused | ✓/✓ |
| ICDAR17 / RCTW (2017) | 12263 (8034/4229) | - (-/-) | Multi-Oriented | Cn | - | ✓/✓ |
| Total-Text (2017) | 1555 (1255/300) | - (-/-) | Multi-Oriented, Curved | En, Cn | Irregular polygon label | ✓/✓ |
| SVT (2010) | 350 (100/250) | 904 (257/647) | Horizontal | En | - | ✓/✓ |
| KAIST (2010) | 3000 (-/-) | 5000 (-/-) | Horizontal | En, Ko | Distorted | ✓/✓ |
| NEOCR (2011) | 659 (-/-) | 5238 (-/-) | Multi-Oriented | 8 langs | - | ✓/✓ |
| CUTE (2014) | 80 (-/80) | - (-/-) | Curved | En | - | ✓/✓ |
| CTW (2017) | 32K (25K/6K) | 1M (812K/205K) | Multi-Oriented | Cn | Fine-grained annotation | ✓/✓ |
| CASIA-10K (2018) | 10K (7K/3K) | - (-/-) | Multi-Oriented | Cn | | ✓/✓ |
| Detection Only | | | | | | |
| OSTD (2011) | 89 (-/-) | 218 (-/-) | Multi-Oriented | En | - | ✓/- |
| MSRA-TD500 (2012) | 500 (300/200) | 1719 (1068/651) | Multi-Oriented | En, Cn | Long text | ✓/- |
| HUST-TR400 (2014) | 400 (400/-) | - (-/-) | Multi-Oriented | En, Cn | Long text | ✓/- |
| ICDAR17 / RRC-MLT (2017) | 18000 (9000/9000) | - (-/-) | Multi-Oriented | 9 langs | - | ✓/- |
| CTW1500 (2017) | 1500 (1000/500) | - (-/-) | Multi-Oriented, Curved | En | Bounding box with 14 vertices | ✓/- |
| Recognition Only | | | | | | |
| Char74k (2009) | 74107 (-/-) | 74107 (-/-) | Horizontal | En, Kannada | Character label | -/✓ |
| IIIT 5K-Word (2012) | 5000 (-/-) | 5000 (2000/3000) | Horizontal | - | Cropped | -/✓ |
| SVHN (2010) | - (-/-) | 600000 (-/-) | Horizontal | - | House number digits | -/✓ |
| SVTP (2013) | 639 (-/639) | - (-/-) | | En | Distorted | -/✓ |

Owner
Shangbang Long