Single Shot Text Detector with Regional Attention

Overview

License

Single Shot Text Detector with Regional Attention

Introduction

SSTD is initially described in our ICCV 2017 spotlight paper.

A third-party implementation of SSTD + Focal Loss. Thanks, Ho taek Han

If you find it useful in your research, please consider citing:

@inproceedings{panhe17singleshot,
      Title   = {Single Shot Text Detector with Regional Attention},
      Author  = {He, Pan and Huang, Weilin and He, Tong and Zhu, Qile and Qiao, Yu and Li, Xiaolin},
      Note    = {Proceedings of Internatioanl Conference on Computer Vision (ICCV)},
      Year    = {2017}
      }
@inproceedings{panhe16readText,
      Title   = {Reading Scene Text in Deep Convolutional Sequences},
      Author  = {He, Pan and Huang, Weilin and Qiao, Yu and Loy, Chen Change and Tang, Xiaoou},
      Note    = {Proceedings of AAAI Conference on Artificial Intelligence, (AAAI)},
      Year    = {2016}
      }
@inproceedings{liu16ssd,
      Title   = {{SSD}: Single Shot MultiBox Detector},
      Author  = {Liu, Wei and Anguelov, Dragomir and Erhan, Dumitru and Szegedy, Christian and Reed, Scott and Fu, Cheng-Yang and Berg, Alexander C.},
      Note    = {Proceedings of European Conference on Computer Vision (ECCV)},
      Year    = {2016}
      }

Installation

  1. Get the code. We will call the directory that you cloned Caffe into $CAFFE_ROOT
git clone https://github.com/BestSonny/SSTD.git
cd SSTD
  1. Build the code. Please follow Caffe instruction to install all necessary packages and build it.
# Modify Makefile.config according to your Caffe installation.
cp Makefile.config.example Makefile.config
make -j8
# Make sure to include $CAFFE_ROOT/python to your PYTHONPATH.
make py
make test -j8
# (Optional)
make runtest -j8
# build nms
cd examples/text
make
cd ..
  1. Run the demo code. Download Model google drive, baiduyun and put it in text/model folder
cd examples
sh text/download.sh
mkdir text/result
python text/demo_test.py
Owner
Pan He
Computer Vision Ph.D. Student @ UF MALT Lab
Pan He
Learning Camera Localization via Dense Scene Matching, CVPR2021

This repository contains code of our CVPR 2021 paper - "Learning Camera Localization via Dense Scene Matching" by Shitao Tang, Chengzhou Tang, Rui Hua

tangshitao 65 Dec 01, 2022
([email protected]) Boosting Co-teaching with Compression Regularization for Label Noise

Nested-Co-teaching ([email protected]) Pytorch implementation of paper "Boosting Co-tea

YINGYI CHEN 41 Jan 03, 2023
Qrcode Attendence System with Opencv and Pyzbar

Setup process Creates a virtual environment (Scripts that ensure executed Python code uses the Python interpreter and site packages installed inside t

Ganesh 5 Aug 01, 2022
Text-to-Image generation

Generate vivid Images for Any (Chinese) text CogView is a pretrained (4B-param) transformer for text-to-image generation in general domain. Read our p

THUDM 1.3k Jan 05, 2023
Repository for playing the computer vision apps: People analytics on Raspberry Pi.

play-with-torch Repository for playing the computer vision apps: People analytics on Raspberry Pi. Tools Tested Hardware RasberryPi 4 Model B here, RA

eMHa 1 Sep 23, 2021
A version of nrsc5-gui that merges the interface developed by cmnybo with the architecture developed by zefie in order to start a new baseline that is not heavily dependent upon Python processing.

NRSC5-DUI is a graphical interface for nrsc5. It makes it easy to play your favorite FM HD radio stations using an RTL-SDR dongle. It will also displa

61 Dec 22, 2022
(CVPR 2021) Back-tracing Representative Points for Voting-based 3D Object Detection in Point Clouds

BRNet Introduction This is a release of the code of our paper Back-tracing Representative Points for Voting-based 3D Object Detection in Point Clouds,

86 Oct 05, 2022
Pixel art search engine for opengameart

Pixel Art Reverse Image Search for OpenGameArt What does the final search look like? The final search with an example can be found here. It looks like

Eivind Magnus Hvidevold 92 Nov 06, 2022
Official code for "Bridging Video-text Retrieval with Multiple Choice Questions", CVPR 2022 (Oral).

Bridging Video-text Retrieval with Multiple Choice Questions, CVPR 2022 (Oral) Paper | Project Page | Pre-trained Model | CLIP-Initialized Pre-trained

Applied Research Center (ARC), Tencent PCG 99 Jan 06, 2023
Amazing 3D explosion animation using Pygame module.

3D Explosion Animation 💣 💥 🔥 Amazing explosion animation with Pygame. 💣 Explosion physics An Explosion instance is made of a set of Particle objec

Dylan Tintenfich 12 Mar 11, 2022
Python bindings for JIGSAW: a Delaunay-based unstructured mesh generator.

JIGSAW: An unstructured mesh generator JIGSAW is an unstructured mesh generator and tessellation library; designed to generate high-quality triangulat

Darren Engwirda 26 Dec 13, 2022
An interactive document scanner built in Python using OpenCV

The scanner takes a poorly scanned image, finds the corners of the document, applies the perspective transformation to get a top-down view of the document, sharpens the image, and applies an adaptive

Kushal Shingote 1 Feb 12, 2022
Regions sanitàries (RS), Sectors Sanitàris (SS) i Àrees Bàsiques de Salut (ABS) de Catalunya

Regions sanitàries (RS), Sectors Sanitaris (SS), Àrees de Gestió Assistencial (AGA) i Àrees Bàsiques de Salut (ABS) de Catalunya Fitxers GeoJSON de le

Glòria Macià Muñoz 2 Jan 23, 2022
Rest API Written In Python To Classify NSFW Images.

✨ NSFW Classifier API ✨ Rest API Written In Python To Classify NSFW Images. Fastest Solution If you don't want to selfhost it, there's already an inst

Akshay Rajput 23 Dec 30, 2022
Maze generator and solver with python

Procedural-Maze-Generator-Algorithms Check out my youtube channel : Auctux Ressources Thanks to Jamis Buck Book : Mazes for programmers Requirements P

Joseph 19 Dec 07, 2022
Detect textlines in document images

Textline Detection Detect textlines in document images Introduction This tool performs border, region and textline detection from document image data

QURATOR-SPK 70 Jun 30, 2022
huoyijie 1.2k Dec 29, 2022
Code for the "Sensing leg movement enhances wearable monitoring of energy expenditure" paper.

EnergyExpenditure Code for the "Sensing leg movement enhances wearable monitoring of energy expenditure" paper. Additional data for replicating this s

Patrick S 42 Oct 26, 2022
Thresholding-and-masking-using-OpenCV - Image Thresholding is used for image segmentation

Image Thresholding is used for image segmentation. From a grayscale image, thresholding can be used to create binary images. In thresholding we pick a threshold T.

Grace Ugochi Nneji 3 Feb 15, 2022
Using computer vision method to recognize and calcutate the features of the architecture.

building-feature-recognition In this repository, we accomplished building feature recognition using traditional/dl-assisted computer vision method. Th

4 Aug 11, 2022