A novel region proposal network for more general object detection ( including scene text detection ).

Overview

DeRPN: Taking a further step toward more general object detection

DeRPN is a novel region proposal network which concentrates on improving the adaptivity of current detectors. The paper is available here.

Recent Update

· Mar. 13, 2019: The DeRPN pretrained models are added.

· Jan. 25, 2019: The code is released.

Contact Us

Welcome to improve DeRPN together. For any questions, please feel free to contact Lele Xie ([email protected]) or Prof. Jin ([email protected]).

Citation

If you find DeRPN useful to your research, please consider citing our paper as follow:

@article{xie2019DeRPN,
  title     = {DeRPN: Taking a further step toward more general object detection},
  author    = {Lele Xie, Yuliang Liu, Lianwen Jin*, Zecheng Xie}
  joural    = {AAAI}
  year      = {2019}
}

Main Results

Note: The reimplemented results are slightly different from those presented in the paper for different training settings, but the conclusions are still consistent. For example, this code doesn't use multi-scale training which should boost the results for both DeRPN and RPN.

COCO-Text

training data: COCO-Text train

test data: COCO-Text test

network [email protected] [email protected] [email protected] [email protected]
RPN+Faster R-CNN VGG16 32.48 52.54 7.40 17.59
DeRPN+Faster R-CNN VGG16 47.39 70.46 11.05 25.12
RPN+R-FCN ResNet-101 37.71 54.35 13.17 22.21
DeRPN+R-FCN ResNet-101 48.62 71.30 13.37 27.57

Pascal VOC

training data: VOC 07+12 trainval

test data: VOC 07 test

Inference time is evaluated on one TITAN XP GPU.

network inference time [email protected] [email protected] AP
RPN+Faster R-CNN VGG16 64 ms 75.53 42.08 42.60
DeRPN+Faster R-CNN VGG16 65 ms 76.17 44.97 43.84
RPN+R-FCN ResNet-101 85 ms 78.87 54.30 50.04
DeRPN+R-FCN (900) * ResNet-101 84 ms 79.21 54.43 50.28

( "*": On Pascal VOC dataset, we found that it is more suitable to train the DeRPN+R-FCN model with 900 proposals. For other experiments, we use the default proposal number to train the models, i.e., 2000 proposals fro Faster R-CNN, 300 proposals for R-FCN. )

MS COCO

training data: COCO 2017 train

test data: COCO 2017 test/val

test set network AP AP50 AP75 APS APM APL
RPN+Faster R-CNN VGG16 24.2 45.4 23.7 7.6 26.6 37.3
DeRPN+Faster R-CNN VGG16 25.5 47.2 25.2 10.3 27.9 36.7
RPN+R-FCN ResNet-101 27.7 47.9 29.0 10.1 30.2 40.1
DeRPN+R-FCN ResNet-101 28.4 49.0 29.5 11.1 31.7 40.5
val set network AP AP50 AP75 APS APM APL
RPN+Faster R-CNN VGG16 24.1 45.0 23.8 7.6 27.8 37.8
DeRPN+Faster R-CNN VGG16 25.5 47.3 25.0 9.9 28.8 37.8
RPN+R-FCN ResNet-101 27.8 48.1 28.8 10.4 31.2 42.5
DeRPN+R-FCN ResNet-101 28.4 48.5 29.5 11.5 32.9 42.0

Getting Started

  1. Requirements
  2. Installation
  3. Preparation for Training & Testing
  4. Usage

Requirements

  1. Cuda 8.0 and cudnn 5.1.
  2. Some python packages: cython, opencv-python, easydict et. al. Simply install them if your system misses these packages.
  3. Configure the caffe according to your environment (Caffe installation instructions). As the code requires pycaffe, caffe should be built with python layers. In Makefile.config, make sure to uncomment this line:
WITH_PYTHON_LAYER := 1
  1. An NVIDIA GPU with more than 6GB is required for ResNet-101.

Installation

  1. Clone the DeRPN repository

    git clone https://github.com/HCIILAB/DeRPN.git
    
  2. Build the Cython modules

    cd $DeRPN_ROOT/lib
    make
  3. Build caffe and pycaffe

    cd $DeRPN_ROOT/caffe
    make -j8 && make pycaffe

Preparation for Training & Testing

Dataset

  1. Download the datasets of Pascal VOC 2007 & 2012, MS COCO 2017 and COCO-Text.

  2. You need to put these datasets under the $DeRPN_ROOT/data folder (with symlinks).

  3. For COCO-Text, the folder structure is as follow:

    $DeRPN_ROOT/data/coco_text/images/train2014
    $DeRPN_ROOT/data/coco_text/images/val2014
    $DeRPN_ROOT/data/coco_text/annotations  
    # train2014, val2014, and annotations are symlinks from /pth_to_coco2014/train2014, 
    # /pth_to_coco2014/val2014 and /pth_to_coco2014/annotations2014/, respectively.
  4. For COCO, the folder structure is as follow:

    $DeRPN_ROOT/data/coco/images/train2017
    $DeRPN_ROOT/data/coco/images/val2017
    $DeRPN_ROOT/data/coco/images/test-dev2017
    $DeRPN_ROOT/data/coco/annotations  
    # the symlinks are similar to COCO-Text
  5. For Pascal VOC, the folder structure is as follow:

    $DeRPN_ROOT/data/VOCdevkit2007
    $DeRPN_ROOT/data/VOCdevkit2012
    #VOCdevkit2007 and VOCdevkit2012 are symlinks from $VOCdevkit whcich contains VOC2007 and VOC2012.

Pretrained models

Please download the ImageNet pretrained models (VGG16 and ResNet-101, password: k4z1), and put them under

$DeRPN_ROOT/data/imagenet_models

We also provide the DeRPN pretrained models here (password: fsd8).

Usage

cd $DeRPN_ROOT
./experiments/scripts/faster_rcnn_derpn_end2end.sh [GPU_ID] [NET] [DATASET]

# e.g., ./experiments/scripts/faster_rcnn_derpn_end2end.sh 0 VGG16 coco_text

Copyright

This code is free to the academic community for research purpose only. For commercial purpose usage, please contact Dr. Lianwen Jin: [email protected].

Owner
Deep Learning and Vision Computing Lab, SCUT
Deep Learning and Vision Computing Lab, SCUT
A curated list of resources dedicated to scene text localization and recognition

Scene Text Localization & Recognition Resources A curated list of resources dedicated to scene text localization and recognition. Any suggestions and

CarlosTao 1.6k Dec 22, 2022
Repository for playing the computer vision apps: People analytics on Raspberry Pi.

play-with-torch Repository for playing the computer vision apps: People analytics on Raspberry Pi. Tools Tested Hardware RasberryPi 4 Model B here, RA

eMHa 1 Sep 23, 2021
Code for CVPR2021 paper "Learning Salient Boundary Feature for Anchor-free Temporal Action Localization"

AFSD: Learning Salient Boundary Feature for Anchor-free Temporal Action Localization This is an official implementation in PyTorch of AFSD. Our paper

Tencent YouTu Research 146 Dec 24, 2022
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)

English | 简体中文 Introduction PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and a

27.5k Jan 08, 2023
Official code for "Bridging Video-text Retrieval with Multiple Choice Questions", CVPR 2022 (Oral).

Bridging Video-text Retrieval with Multiple Choice Questions, CVPR 2022 (Oral) Paper | Project Page | Pre-trained Model | CLIP-Initialized Pre-trained

Applied Research Center (ARC), Tencent PCG 99 Jan 06, 2023
Image augmentation for machine learning experiments.

imgaug This python library helps you with augmenting images for your machine learning projects. It converts a set of input images into a new, much lar

Alexander Jung 13.2k Jan 02, 2023
OCR, Object Detection, Number Plate, Real Time

README.md PrePareded anaconda env requirements.txt clova AI → deep text recognition → trained weights (ex, .pth) wpod-net weights (ex, .h5 , .json) ht

Kaven Lee 7 Dec 06, 2022
Read Japanese manga inside browser with selectable text.

mokuro Read Japanese manga with selectable text inside a browser. See demo: https://kha-white.github.io/manga-demo mokuro_demo.mp4 Demo contains excer

Maciej Budyś 170 Dec 27, 2022
Detect handwritten words in a text-line (classic image processing method).

Word segmentation Implementation of scale space technique for word segmentation as proposed by R. Manmatha and N. Srimal. Even though the paper is fro

Harald Scheidl 190 Jan 03, 2023
Discord QR Scam Code Generator + Token grab mobile device.

A Python script that automatically generates a Nitro scam QR code and grabs the Discord token when scanned.

Visual 9 Nov 22, 2022
LEARN OPENCV IN 3 HOURS USING PYTHON - INCLUDING EXAMPLE PROJECTS

LEARN OPENCV IN 3 HOURS USING PYTHON - INCLUDING EXAMPLE PROJECTS

Murtaza Hassan 815 Dec 29, 2022
Assignment work with webcam

work with webcam : Press key 1 to use emojy on your face Press key 2 to use lip and eye on your face Press key 3 to checkered your face Press key 4 to

Hanane Kheirandish 2 May 31, 2022
Brief idea about our project is mentioned in project presentation file.

Brief idea about our project is mentioned in project presentation file. You just have to run attendance.py file in your suitable IDE but we prefer jupyter lab.

Dhruv ;-) 3 Mar 20, 2022
([email protected]) Boosting Co-teaching with Compression Regularization for Label Noise

Nested-Co-teaching ([email protected]) Pytorch implementation of paper "Boosting Co-tea

YINGYI CHEN 41 Jan 03, 2023
fishington.io bot with OpenCV and NumPy

fishington.io-bot fishington.io bot with using OpenCV and NumPy bot can continue to fishing fully automatically how to use Open cmd in fishington.io-b

Bahadır Araz 77 Jan 02, 2023
GDB python tool to pretty print and debug c++ xtensor containers

gdb_xt2np GDB python tool to pretty print, examine, and debug c++ Xtensor containers. Xtensor is a c++ library for scientific computing using multidim

Christopher Burke 4 Oct 29, 2021
Lightning Fast Language Prediction 🚀

whatthelang Lightning Fast Language Prediction 🚀 Dependencies The dependencies can be installed using the requirements.txt file: $ pip install -r req

Indix 152 Oct 16, 2022
A list of hyperspectral image super-solution resources collected by Junjun Jiang

A list of hyperspectral image super-resolution resources collected by Junjun Jiang. If you find that important resources are not included, please feel free to contact me.

Junjun Jiang 301 Jan 05, 2023
color detection using python

colordetection color detection using python In this color detection Python project, we are going to build an application through which you can automat

Ruchith Kumar 1 Nov 04, 2021
Tracking the latest progress in Scene Text Detection and Recognition: Must-read papers well organized

SceneTextPapers Tracking the latest progress in Scene Text Detection and Recognition: must-read papers well organized Information about this repositor

Shangbang Long 763 Jan 01, 2023