Rotated Box Is Back : Accurate Box Proposal Network for Scene Text Detection

Overview

Rotated Box Is Back : Accurate Box Proposal Network for Scene Text Detection

overview

This material is supplementray code for paper accepted in ICDAR 2021

  1. We highly recommend to use docker image because our model contains custom operation which depends on framework and cuda version.
  2. We provide trained model for ICDAR 2017, 2013 which is in final_checkpoint_ch8 and for ICDAR 2015 which is in final_checkpoint_ch4
  3. This code is mainly focused on inference. To train our model, training gpu like V100 is needed. please check our paper in detail.

REQUIREMENT

  1. Nvidia-docker
  2. Tensorflow 1.14
  3. Miminum GPU requirement : NVIDIA GTX 1080TI

INSTALLATION

  • Make docker image and container
docker build --tag rbimage ./dockerfile
docker run --runtime=nvidia --name rbcontainer -v /rotated-box-is-back-path:/rotated-box-is-back -i -t rbimage /bin/bash
  • build custom operations in container
cd /rotated-box-is-back/nms 
cmake ./
make
./shell.sh

SAMPLE IMAGE INFERENCE

cd /rotated-box-is-back/
python viz.py --test_data_path=./sample --checkpoint_path=./final_checkpoint_ch8 --output_dir=./sample_result  --thres 0.6 --min_size=1600 --max_size=2000

ICDAR 2017 INFERENCE

  1. please replace icdar_testset_path to your-icdar-2017-testset-folder path.
python viz.py --test_data_path=icdar_testset_path --checkpoint_path=./final_checkpoint_ch8 --output_dir=./ic17  --thres 0.6 --min_size=1600 --max_size=2000

ICDAR 2015 INFERENCE

  1. please replace icdar_testset_path to your-icdar-2015-testset-folder path.
  2. To converting evalutation format. Convert result text file like below
python viz.py --test_data_path=icdar_testset_path --checkpoint_path=./final_checkpoint_ch4 --output_dir=./ic15  --thres 0.7 --min_size=1100 --max_size=2000
python text_postprocessing.py -i=./ic15/ -o=./ic15_format/ -e True

ICDAR 2013 INFERENCE

  1. please replace icdar_testset_path to your-icdar-2013-testset-folder path.
  2. To converting evalutation format. Convert result text file like below
python viz.py --test_data_path=icdar_testset_path --checkpoint_path=./final_checkpoint_ch8 --output_dir=./ic13  --thres 0.55 --min_size=700 --max_size=900
python text_postprocessing.py -i=./ic13/ -o=./ic13_format/ -e True -m rec

EVALUATION TABLE

IC13 IC15 IC17
P R F P R F P R F
95.9 89.1 92.4 89.7 84.2 86.9 83.4 68.2 75.0

TRAINING

  1. It can be trained below command line
python train_refine_estimator.py --input_size=1024 --batch_size=2 --checkpoint_path=./finetuning --training_data_path=your-image-path --training_gt_path=your-gt-path  --learning_rate=0.00001 --max_epochs=500  --save_summary_steps=1000 --warmup_path=./final_checkpoint_ch8

ACKNOWLEDGEMENT

This work was supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 1711125972, Audio-Visual Perception for Autonomous Rescue Drones).

CITATION

If you found it is helpfull for your research, please cite:

Lee J., Lee J., Yang C., Lee Y., Lee J. (2021) Rotated Box Is Back: An Accurate Box Proposal Network for Scene Text Detection. In: Lladós J., Lopresti D., Uchida S. (eds) Document Analysis and Recognition – ICDAR 2021. ICDAR 2021. Lecture Notes in Computer Science, vol 12824. Springer, Cham. https://doi.org/10.1007/978-3-030-86337-1_4

Owner
NCSOFT
NCSOFT Open Sources
NCSOFT
Back to the Feature: Learning Robust Camera Localization from Pixels to Pose (CVPR 2021)

Back to the Feature with PixLoc We introduce PixLoc, a neural network for end-to-end learning of camera localization from an image and a 3D model via

Computer Vision and Geometry Lab 610 Jan 05, 2023
Differential rendering based motion capture blender project.

TraceArmature Summary TraceArmature is currently a set of python scripts that allow for high fidelity motion capture through the use of AI pose estima

William Rodriguez 4 May 27, 2022
Self-Supervised CNN-GCN Autoencoder

GCNDepth Self-Supervised CNN-GCN Autoencoder GCNDepth: Self-supervised monocular depth estimation based on graph convolutional network To be published

53 Dec 14, 2022
Time Delayed NN implemented in pytorch

Pytorch Time Delayed NN Time Delayed NN implemented in PyTorch. Usage kernels = [(1, 25), (2, 50), (3, 75), (4, 100), (5, 125), (6, 150)] tdnn = TDNN

Daniil Gavrilov 79 Aug 04, 2022
Skyformer: Remodel Self-Attention with Gaussian Kernel and Nystr\"om Method (NeurIPS 2021)

Skyformer This repository is the official implementation of Skyformer: Remodel Self-Attention with Gaussian Kernel and Nystr"om Method (NeurIPS 2021).

Qi Zeng 46 Sep 20, 2022
This repository contains the code and models for the following paper.

DC-ShadowNet Introduction This is an implementation of the following paper DC-ShadowNet: Single-Image Hard and Soft Shadow Removal Using Unsupervised

AuAgCu 65 Dec 27, 2022
Config files for my GitHub profile.

Canalyst Candas Data Science Library Name Canalyst Candas Description Built by a former PM / analyst to give anyone with a little bit of Python knowle

Canalyst Candas 13 Jun 24, 2022
Finding all things on-prem Microsoft for password spraying and enumeration.

msprobe About Installing Usage Examples Coming Soon Acknowledgements About Finding all things on-prem Microsoft for password spraying and enumeration.

205 Jan 09, 2023
RoFormer_pytorch

PyTorch RoFormer 原版Tensorflow权重(https://github.com/ZhuiyiTechnology/roformer) chinese_roformer_L-12_H-768_A-12.zip (提取码:xy9x) 已经转化为PyTorch权重 chinese_r

yujun 283 Dec 12, 2022
JAXMAPP: JAX-based Library for Multi-Agent Path Planning in Continuous Spaces

JAXMAPP: JAX-based Library for Multi-Agent Path Planning in Continuous Spaces JAXMAPP is a JAX-based library for multi-agent path planning (MAPP) in c

OMRON SINIC X 24 Dec 28, 2022
FlowTorch is a PyTorch library for learning and sampling from complex probability distributions using a class of methods called Normalizing Flows

FlowTorch is a PyTorch library for learning and sampling from complex probability distributions using a class of methods called Normalizing Flows.

Meta Incubator 272 Jan 02, 2023
Testing the Facial Emotion Recognition (FER) algorithm on animations

PegHeads-Tutorial-3 Testing the Facial Emotion Recognition (FER) algorithm on animations

PegHeads Inc 2 Jan 03, 2022
Implementation of CVPR 2021 paper "Spatially-invariant Style-codes Controlled Makeup Transfer"

SCGAN Implementation of CVPR 2021 paper "Spatially-invariant Style-codes Controlled Makeup Transfer" Prepare The pre-trained model is avaiable at http

118 Dec 12, 2022
Code for the AAAI-2022 paper: Imagine by Reasoning: A Reasoning-Based Implicit Semantic Data Augmentation for Long-Tailed Classification

Imagine by Reasoning: A Reasoning-Based Implicit Semantic Data Augmentation for Long-Tailed Classification (AAAI 2022) Prerequisite PyTorch = 1.2.0 P

16 Dec 14, 2022
A TensorFlow implementation of the Mnemonic Descent Method.

MDM A Tensorflow implementation of the Mnemonic Descent Method. Mnemonic Descent Method: A recurrent process applied for end-to-end face alignment G.

123 Oct 07, 2022
Libraries, tools and tasks created and used at DeepMind Robotics.

dm_robotics: Libraries, tools, and tasks created and used for Robotics research at DeepMind. Package overview Package Summary Transformations Rigid bo

DeepMind 273 Jan 06, 2023
Pytorch implementation of SELF-ATTENTIVE VAD, ICASSP 2021

SELF-ATTENTIVE VAD: CONTEXT-AWARE DETECTION OF VOICE FROM NOISE (ICASSP 2021) Pytorch implementation of SELF-ATTENTIVE VAD | Paper | Dataset Yong Rae

97 Dec 23, 2022
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR2021)

NExT-QA We reproduce some SOTA VideoQA methods to provide benchmark results for our NExT-QA dataset accepted to CVPR2021 (with 1 'Strong Accept' and 2

Junbin Xiao 50 Nov 24, 2022
TYolov5: A Temporal Yolov5 Detector Based on Quasi-Recurrent Neural Networks for Real-Time Handgun Detection in Video

TYolov5: A Temporal Yolov5 Detector Based on Quasi-Recurrent Neural Networks for Real-Time Handgun Detection in Video Timely handgun detection is a cr

Mario Duran-Vega 18 Dec 26, 2022
Supporting code for "Autoregressive neural-network wavefunctions for ab initio quantum chemistry".

naqs-for-quantum-chemistry This repository contains the codebase developed for the paper Autoregressive neural-network wavefunctions for ab initio qua

Tom Barrett 24 Dec 23, 2022