A PyTorch implementation of Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks

Last update: Jan 03, 2023

Overview

SVHNClassifier-PyTorch

A PyTorch implementation of Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks

If you're interested in C++ inference, move HERE

Results

Steps	GPU	Batch Size	Learning Rate	Patience	Decay Step	Decay Rate	Training Speed (FPS)	Accuracy
54000	GTX 1080 Ti	512	0.16	100	625	0.9	~1700	95.65%

Sample

$ python infer.py -c=./logs/model-54000.pth ./images/test-75.png
length: 2
digits: 7 5 10 10 10

$ python infer.py -c=./logs/model-54000.pth ./images/test-190.png
length: 3
digits: 1 9 0 10 10

Loss

Requirements

Python 3.6
torch 1.0
torchvision 0.2.1
visdom
```
$ pip install visdom
```

h5py

In Ubuntu:
$ sudo apt-get install libhdf5-dev
$ sudo pip install h5py

protobuf
```
$ pip install protobuf
```
lmdb
```
$ pip install lmdb
```

Setup

Clone the source code

$ git clone https://github.com/potterhsu/SVHNClassifier-PyTorch
$ cd SVHNClassifier-PyTorch

Download SVHN Dataset format 1

Extract to data folder, now your folder structure should be like below:

SVHNClassifier
    - data
        - extra
            - 1.png 
            - 2.png
            - ...
            - digitStruct.mat
        - test
            - 1.png 
            - 2.png
            - ...
            - digitStruct.mat
        - train
            - 1.png 
            - 2.png
            - ...
            - digitStruct.mat

Usage

(Optional) Take a glance at original images with bounding boxes
```
Open `draw_bbox.ipynb` in Jupyter
```

Convert to LMDB format

$ python convert_to_lmdb.py --data_dir ./data

(Optional) Test for reading LMDBs

Open `read_lmdb_sample.ipynb` in Jupyter

Train

$ python train.py --data_dir ./data --logdir ./logs

Retrain if you need

$ python train.py --data_dir ./data --logdir ./logs_retrain --restore_checkpoint ./logs/model-100.pth

Evaluate

$ python eval.py --data_dir ./data ./logs/model-100.pth

Visualize

$ python -m visdom.server
$ python visualize.py --logdir ./logs

Infer

$ python infer.py --checkpoint=./logs/model-100.pth ./images/test1.png

Clean

$ rm -rf ./logs
or
$ rm -rf ./logs_retrain

A PyTorch implementation of Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks

Related tags

Overview

SVHNClassifier-PyTorch

Results

Sample

Loss

Requirements

Setup

Usage

Owner

Potter Hsu

Official implementation of the PICASO: Permutation-Invariant Cascaded Attentional Set Operator

Code for ICCV2021 paper PARE: Part Attention Regressor for 3D Human Body Estimation

Predicts an answer in yes or no.

[SIGGRAPH Asia 2019] Artistic Glyph Image Synthesis via One-Stage Few-Shot Learning

Open source annotation tool for machine learning practitioners.

PyTorch-centric library for evaluating and enhancing the robustness of AI technologies

This repository contains the code for our paper VDA (public in EMNLP2021 main conference)

Code and data for "Broaden the Vision: Geo-Diverse Visual Commonsense Reasoning" (EMNLP 2021).

Official pytorch implementation of "DSPoint: Dual-scale Point Cloud Recognition with High-frequency Fusion"

Tutorial in Python targeted at Epidemiologists. Will discuss the basics of analysis in Python 3

[CIKM 2021] Enhancing Aspect-Based Sentiment Analysis with Supervised Contrastive Learning

Intent parsing and slot filling in PyTorch with seq2seq + attention

Fast, Attemptable Route Planner for Navigation in Known and Unknown Environments

A framework for multi-step probabilistic time-series/demand forecasting models

Lucid Sonic Dreams syncs GAN-generated visuals to music.

A library for low-memory inferencing in PyTorch.

A PyTorch Implementation of Single Shot Scale-invariant Face Detector.

Code for the paper "Reinforcement Learning as One Big Sequence Modeling Problem"

Multi-task Self-supervised Object Detection via Recycling of Bounding Box Annotations (CVPR, 2019)

Code for Massive-scale Decoding for Text Generation using Lattices