Source code of RRPN ---- Arbitrary-Oriented Scene Text Detection via Rotation Proposals

Last update: Nov 22, 2022

Paper source

Arbitrary-Oriented Scene Text Detection via Rotation Proposals

https://arxiv.org/abs/1703.01086

News

RRPN has been updated to PyTorch 1.0! See https://github.com/mjq11302010044/RRPN_plusplus for more details. The text spotter F-measure is 89.5% on IC15 and 92.0% on IC13. Testing speed reaches 13.3 fps on IC13 when the shorter side of the input is 640 px.

License

RRPN is released under the MIT License (refer to the LICENSE file for details). This project is for research purposes only; for any further use of RRPN, please contact the authors.

Citing RRPN

If you find RRPN useful in your research, please consider citing:

@article{Jianqi17RRPN,
    Author = {Jianqi Ma and Weiyuan Shao and Hao Ye and Li Wang and Hong Wang and Yingbin Zheng and Xiangyang Xue},
    Title = {Arbitrary-Oriented Scene Text Detection via Rotation Proposals},
    journal = {IEEE Transactions on Multimedia},
    volume={20}, 
    number={11}, 
    pages={3111-3122}, 
    year={2018}
}

Contents

Requirements: software
Requirements: hardware
Basic installation
Demo
Beyond the demo: training and testing

Requirements: software

Requirements for Caffe and pycaffe (see the Caffe installation instructions: http://caffe.berkeleyvision.org/installation.html)

Note: Caffe must be built with support for Python layers!

# In your Makefile.config, make sure to have this line uncommented
WITH_PYTHON_LAYER := 1
# Unrelatedly, it's also recommended that you use CUDNN
USE_CUDNN := 1

You can download my Makefile.config for reference.

Python packages you might not have: cython, python-opencv, easydict
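A quick way to see which of these Python packages are still missing is a small import check like the one below. This is only a sketch: the mapping from package name to importable module name (for example python-opencv to cv2) is an assumption based on how these packages are usually imported, and the helper name find_missing is illustrative, not part of RRPN.

```python
import importlib

# Package name from the list above -> module name to import.
# ASSUMPTION: these are the usual module names for the packages.
REQUIRED_MODULES = {
    "cython": "Cython",
    "python-opencv": "cv2",
    "easydict": "easydict",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the package names whose module cannot be imported."""
    missing = []
    for package, module in sorted(modules.items()):
        try:
            importlib.import_module(module)
        except ImportError:
            missing.append(package)
    return missing


if __name__ == "__main__":
    missing = find_missing()
    print("Missing packages: " + (", ".join(missing) if missing else "none"))
```

Run it with the same Python interpreter that you will use for pycaffe, since packages installed for a different interpreter will not be found.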

Requirements: hardware

For training the end-to-end version of RRPN with VGG16, 4 to 5 GB of GPU memory is sufficient (using cuDNN).

Installation (sufficient for the demo)

Clone the RRPN repository

git clone https://github.com/mjq11302010044/RRPN.git

We'll call the directory that you cloned RRPN into RRPN_ROOT.

Build the Cython modules
```
cd $RRPN_ROOT/lib
make
```

Build Caffe and pycaffe

cd $RRPN_ROOT/caffe-fast-rcnn
# Now follow the Caffe installation instructions here:
#   http://caffe.berkeleyvision.org/installation.html

# If you're experienced with Caffe and have all of the requirements installed
# and your Makefile.config in place, then simply do:
make -j4 && make pycaffe
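Once the build finishes, one way to confirm that pycaffe is usable is to add Caffe's Python directory to sys.path and try the import. The sketch below assumes the bindings end up in caffe-fast-rcnn/python under RRPN_ROOT (the usual layout for a Caffe checkout) and that RRPN_ROOT is set as an environment variable; the helper names are illustrative.

```python
import os
import sys


def pycaffe_dir(rrpn_root):
    """Assumed location of the pycaffe bindings inside the repository."""
    return os.path.join(rrpn_root, "caffe-fast-rcnn", "python")


def can_import_caffe(rrpn_root):
    """Add the assumed pycaffe directory to sys.path and try to import caffe."""
    path = pycaffe_dir(rrpn_root)
    if path not in sys.path:
        sys.path.insert(0, path)
    try:
        import caffe  # noqa: F401  (only checking that the import works)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    root = os.environ.get("RRPN_ROOT", ".")
    print("pycaffe importable:", can_import_caffe(root))
```

If this prints False, check that make pycaffe completed without errors and that WITH_PYTHON_LAYER is enabled in Makefile.config.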

Download pre-computed RRPN detectors

Trained VGG16 model download link: https://drive.google.com/open?id=0B5rKZkZodGIsV2RJUjVlMjNOZkE

Then move the model into $RRPN_ROOT/data/faster_rcnn_models.
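Before running the demo, it can help to check that the model file really is where the code will look for it. The following sketch lists Caffe weight files in that directory; it assumes the downloaded model uses the standard .caffemodel extension, and the function names are illustrative.

```python
import glob
import os


def model_dir(rrpn_root):
    """Directory where the trained model is expected, as described above."""
    return os.path.join(rrpn_root, "data", "faster_rcnn_models")


def find_models(rrpn_root):
    """Return the sorted .caffemodel files found in the model directory."""
    # ASSUMPTION: the trained model is stored as a *.caffemodel file.
    return sorted(glob.glob(os.path.join(model_dir(rrpn_root), "*.caffemodel")))


if __name__ == "__main__":
    root = os.environ.get("RRPN_ROOT", ".")
    models = find_models(root)
    print("Found models:", models if models else "none")
```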

Demo

After successfully completing basic installation, you'll be ready to run the demo.

To run the demo

cd $RRPN_ROOT
python ./tools/rotation_demo.py

The txt results will be saved in $RRPN_ROOT/result

Beyond the demo: installation for training and testing models

You can use the function get_rroidb() in $RRPN_ROOT/lib/rotation/data_extractor.py to manage your training data.

Each training sample should be stored in a Python dict like this:

# Replace each ... placeholder with the real value for your sample.
im_info = {
	'gt_classes': ...,    # set to 1 (text only)
	'max_classes': ...,   # set to 1 (text only)
	'image': ...,         # path of the image
	'boxes': ...,         # ground-truth boxes
	'flipped': ...,       # whether the image is flipped (not implemented)
	'gt_overlaps': ...,   # overlap with the class (text)
	'seg_areas': ...,     # area of each ground-truth region
	'height': ...,        # height of the image
	'width': ...,         # width of the image
	'max_overlaps': ...,  # maximum overlap for each ground-truth proposal
	'rotated': ...,       # random angle by which to rotate the image
}
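To make the structure concrete, here is a hedged sketch that builds one such dict for a text-only dataset and checks that no key is missing. Only the key names come from the description above. Everything else is an assumption: the box layout (x_center, y_center, box_height, box_width, angle), the two-column gt_overlaps (background, text) borrowed from the usual Faster R-CNN roidb convention, the value 0 for 'rotated', and the use of plain lists where the training code most likely expects NumPy arrays. Check get_rroidb() for the actual types.

```python
REQUIRED_KEYS = (
    "gt_classes", "max_classes", "image", "boxes", "flipped",
    "gt_overlaps", "seg_areas", "height", "width", "max_overlaps", "rotated",
)


def make_sample(image_path, height, width, boxes):
    """Build one training sample (hypothetical helper, not part of RRPN).

    ASSUMPTION: each box is (x_center, y_center, box_height, box_width, angle).
    """
    n = len(boxes)
    return {
        "gt_classes": [1] * n,                        # 1 = text
        "max_classes": [1] * n,                       # 1 = text
        "image": image_path,
        "boxes": [list(b) for b in boxes],
        "flipped": False,                             # flipping is not implemented
        "gt_overlaps": [[0.0, 1.0] for _ in boxes],   # ASSUMPTION: (background, text)
        "seg_areas": [b[2] * b[3] for b in boxes],    # product of the two side lengths
        "height": height,
        "width": width,
        "max_overlaps": [1.0] * n,                    # a ground-truth box fully overlaps itself
        "rotated": 0,                                 # ASSUMPTION: 0 = no extra rotation
    }


def missing_keys(sample):
    """Return the required keys that are absent from a sample dict."""
    return [key for key in REQUIRED_KEYS if key not in sample]


if __name__ == "__main__":
    sample = make_sample("img_1.jpg", 480, 640, [(100.0, 50.0, 20.0, 80.0, -15.0)])
    print("missing keys:", missing_keys(sample))
```

Running a check like missing_keys() over every sample before training catches incomplete entries early, before they cause a failure deep inside the data layer.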

Then assign your database to the variable roidb in the main function of $RRPN_ROOT/tools/train_net.py (line 116):

roidb = get_rroidb("train") # replace with your own data-management function

Download pre-trained ImageNet models

Pre-trained ImageNet models can be downloaded for the networks described in the paper: VGG16.

cd $RRPN_ROOT
./data/scripts/fetch_imagenet_models.sh

VGG16 comes from the Caffe Model Zoo, but is provided here for your convenience. ZF was trained at MSRA.

Then you can train RRPN by typing:

./experiment/scripts/faster_rcnn_end2end.sh [GPU_ID] [NET] rrpn

[NET] is usually VGG16.

Trained RRPN networks are saved under the following directory (set to './' by default):

./

You can change this directory through the variable output_dir in $RRPN_ROOT/tools/train_net.py.

If you have any questions about this project, please send a message to Jianqi Ma ([email protected]). Enjoy!
