A PyTorch implementation of Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks

Last update: Jan 03, 2023

Overview

SVHNClassifier-PyTorch

If you're interested in C++ inference, move HERE

Results

| Steps | GPU | Batch Size | Learning Rate | Patience | Decay Step | Decay Rate | Training Speed (FPS) | Accuracy |
|---|---|---|---|---|---|---|---|---|
| 54000 | GTX 1080 Ti | 512 | 0.16 | 100 | 625 | 0.9 | ~1700 | 95.65% |
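
The Learning Rate, Decay Step and Decay Rate columns suggest a step-wise exponential schedule. A minimal sketch, assuming the rate is multiplied by the decay rate once every decay-step training steps (a staircase schedule). The function name `lr_at_step` is hypothetical, and the schedule actually used by `train.py` may differ:

```python
def lr_at_step(step, base_lr=0.16, decay_step=625, decay_rate=0.9):
    """Staircase exponential decay (assumed): multiply the learning rate
    by decay_rate once every decay_step training steps."""
    if step < 0:
        raise ValueError("step must be non-negative")
    return base_lr * decay_rate ** (step // decay_step)


if __name__ == "__main__":
    # Print the assumed learning rate at a few points of the 54000-step run.
    for step in (0, 625, 6250, 54000):
        print(step, lr_at_step(step))
```

Under this assumption the rate stays at 0.16 for the first 624 steps and drops by 10% at step 625.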

Sample

$ python infer.py -c=./logs/model-54000.pth ./images/test-75.png
length: 2
digits: 7 5 10 10 10

$ python infer.py -c=./logs/model-54000.pth ./images/test-190.png
length: 3
digits: 1 9 0 10 10
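
The `digits` line always lists five values. The two samples suggest that only the first `length` entries are real digits and that `10` fills the unused positions. A minimal sketch of turning this output into a number string, assuming that reading is correct (the helper `decode_prediction` is hypothetical and not part of `infer.py`):

```python
def decode_prediction(length, digits, blank=10):
    """Join the first `length` digit slots into a string.

    Assumption: `blank` (10) marks an unused slot, so it is skipped
    even if it appears among the first `length` entries.
    """
    kept = [d for d in digits[:length] if d != blank]
    return "".join(str(d) for d in kept)


if __name__ == "__main__":
    print(decode_prediction(2, [7, 5, 10, 10, 10]))
    print(decode_prediction(3, [1, 9, 0, 10, 10]))
```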

Loss

Requirements

Python 3.6
torch 1.0
torchvision 0.2.1
visdom
```
$ pip install visdom
```

h5py

On Ubuntu:
$ sudo apt-get install libhdf5-dev
$ sudo pip install h5py

protobuf
```
$ pip install protobuf
```
lmdb
```
$ pip install lmdb
```
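
Before training, it can help to confirm that the packages listed above are importable. A small sketch using only the standard library; the helper name `missing_modules` is hypothetical, and the import name `google.protobuf` is the module installed by the `protobuf` package:

```python
import importlib.util

# Import names for the packages listed under Requirements.
REQUIRED = ["torch", "torchvision", "visdom", "h5py", "google.protobuf", "lmdb"]


def missing_modules(names):
    """Return the subset of `names` that cannot be located for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. `google`) is absent
            # or a module has no usable spec.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("missing:", missing_modules(REQUIRED))
```

This only checks that the modules can be found; it does not verify the versions listed above.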

Setup

Clone the source code

$ git clone https://github.com/potterhsu/SVHNClassifier-PyTorch
$ cd SVHNClassifier-PyTorch

Download SVHN Dataset format 1

Extract the archives into the data folder. Your folder structure should then look like this:

SVHNClassifier-PyTorch
    - data
        - extra
            - 1.png 
            - 2.png
            - ...
            - digitStruct.mat
        - test
            - 1.png 
            - 2.png
            - ...
            - digitStruct.mat
        - train
            - 1.png 
            - 2.png
            - ...
            - digitStruct.mat
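
The layout above can be checked before running the conversion step. A minimal sketch, assuming the three split folders each need a `digitStruct.mat` file and at least one PNG image (the function `check_layout` is hypothetical and not part of the repository):

```python
from pathlib import Path

SPLITS = ("train", "test", "extra")


def check_layout(data_dir):
    """Return a list of human-readable problems found under data_dir."""
    problems = []
    for split in SPLITS:
        split_dir = Path(data_dir) / split
        if not split_dir.is_dir():
            problems.append(f"missing folder: {split_dir}")
            continue
        if not (split_dir / "digitStruct.mat").is_file():
            problems.append(f"missing file: {split_dir / 'digitStruct.mat'}")
        if not any(split_dir.glob("*.png")):
            problems.append(f"no PNG images in: {split_dir}")
    return problems


if __name__ == "__main__":
    for problem in check_layout("./data"):
        print(problem)
```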

Usage

(Optional) Take a look at the original images with bounding boxes

Open `draw_bbox.ipynb` in Jupyter

Convert to LMDB format

$ python convert_to_lmdb.py --data_dir ./data

(Optional) Test for reading LMDBs

Open `read_lmdb_sample.ipynb` in Jupyter

Train

$ python train.py --data_dir ./data --logdir ./logs

Retrain from a checkpoint if needed

$ python train.py --data_dir ./data --logdir ./logs_retrain --restore_checkpoint ./logs/model-100.pth
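
The `--restore_checkpoint` argument takes an explicit file path. A small sketch for finding the most recent checkpoint in a log directory, assuming checkpoints follow the `model-<step>.pth` naming seen in the examples here (the helper `latest_checkpoint` is hypothetical):

```python
import re
from pathlib import Path

# Assumed naming pattern, inferred from paths such as ./logs/model-54000.pth
CHECKPOINT_RE = re.compile(r"^model-(\d+)\.pth$")


def latest_checkpoint(logdir):
    """Return the checkpoint path with the highest step number, or None."""
    best_step, best_path = -1, None
    for path in Path(logdir).glob("model-*.pth"):
        match = CHECKPOINT_RE.match(path.name)
        if match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), path
    return best_path


if __name__ == "__main__":
    print(latest_checkpoint("./logs"))
```

Comparing the parsed step numbers rather than the file names avoids ordering `model-100.pth` after `model-54000.pth`.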

Evaluate

$ python eval.py --data_dir ./data ./logs/model-100.pth
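
The accuracy reported in Results is presumably measured per whole number rather than per digit. A minimal sketch under that assumption, where a sample counts as correct only if the predicted length and all digit slots match the label. The function `sequence_accuracy` and the `(length, digits)` tuple layout are hypothetical and not taken from `eval.py`:

```python
def sequence_accuracy(predictions, labels):
    """Fraction of samples whose length and every digit slot match.

    Each item is a (length, digits) pair, e.g. (2, [7, 5, 10, 10, 10]).
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same size")
    if not labels:
        return 0.0
    correct = sum(
        1
        for (pred_len, pred_digits), (true_len, true_digits) in zip(predictions, labels)
        if pred_len == true_len and list(pred_digits) == list(true_digits)
    )
    return correct / len(labels)


if __name__ == "__main__":
    preds = [(2, [7, 5, 10, 10, 10]), (3, [1, 9, 0, 10, 10])]
    truth = [(2, [7, 5, 10, 10, 10]), (3, [1, 9, 1, 10, 10])]
    print(sequence_accuracy(preds, truth))
```

With this definition, a single wrong digit makes the whole number count as an error.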

Visualize

$ python -m visdom.server
$ python visualize.py --logdir ./logs

Infer

$ python infer.py --checkpoint=./logs/model-100.pth ./images/test1.png

Clean

$ rm -rf ./logs
or
$ rm -rf ./logs_retrain

Owner

Potter Hsu
