A micro OCR network whose parameters take about 0.07 MB.

Last update: Aug 06, 2022

MicroOCR

Model summary, layer by layer:

        Layer (type)               Output Shape         Param #

            Conv2d-1            [-1, 64, 8, 32]           3,136
       BatchNorm2d-2            [-1, 64, 8, 32]             128
              GELU-3            [-1, 64, 8, 32]               0
         ConvBNACT-4            [-1, 64, 8, 32]               0
            Conv2d-5            [-1, 64, 8, 32]             640
       BatchNorm2d-6            [-1, 64, 8, 32]             128
              GELU-7            [-1, 64, 8, 32]               0
         ConvBNACT-8            [-1, 64, 8, 32]               0
            Conv2d-9            [-1, 64, 8, 32]           4,160
      BatchNorm2d-10            [-1, 64, 8, 32]             128
             GELU-11            [-1, 64, 8, 32]               0
        ConvBNACT-12            [-1, 64, 8, 32]               0
       MicroBlock-13            [-1, 64, 8, 32]               0
           Conv2d-14            [-1, 64, 8, 32]             640
      BatchNorm2d-15            [-1, 64, 8, 32]             128
             GELU-16            [-1, 64, 8, 32]               0
        ConvBNACT-17            [-1, 64, 8, 32]               0
           Conv2d-18            [-1, 64, 8, 32]           4,160
      BatchNorm2d-19            [-1, 64, 8, 32]             128
             GELU-20            [-1, 64, 8, 32]               0
        ConvBNACT-21            [-1, 64, 8, 32]               0
       MicroBlock-22            [-1, 64, 8, 32]               0
          Flatten-23              [-1, 64, 256]               0
AdaptiveAvgPool1d-24               [-1, 64, 30]               0
           Linear-25               [-1, 30, 60]           3,900

Total params: 17,276
Trainable params: 17,276
Non-trainable params: 0
Input size (MB): 0.05
Forward/backward pass size (MB): 2.90
Params size (MB): 0.07
Estimated Total Size (MB): 3.02
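
The parameter counts in the summary can be checked by hand. The sketch below is inferred from the numbers alone and is not taken from model.py. It assumes a 3-channel 32x128 input, a 4x4 stem convolution with stride 4, two blocks that each pair a 3x3 depthwise convolution with a 1x1 pointwise convolution (all with biases), and a final linear layer from 64 features to 60 classes. Other configurations could produce the same totals, and the helper names are hypothetical.

```python
def conv2d_params(in_ch, out_ch, kernel, groups=1, bias=True):
    """Weights plus optional bias of a 2-D convolution with a square kernel."""
    weights = out_ch * (in_ch // groups) * kernel * kernel
    return weights + (out_ch if bias else 0)


def batchnorm_params(channels):
    """One learnable scale and one learnable shift per channel."""
    return 2 * channels


def linear_params(in_features, out_features):
    """Weight matrix plus bias vector of a fully connected layer."""
    return in_features * out_features + out_features


C = 64            # channel width, from the summary
NUM_CLASSES = 60  # last dimension of the Linear output, from the summary

# Assumed stem: 4x4 convolution on an RGB image, followed by batch norm.
stem = conv2d_params(3, C, kernel=4) + batchnorm_params(C)
# Assumed block: depthwise 3x3 (groups == channels), then pointwise 1x1.
depthwise = conv2d_params(C, C, kernel=3, groups=C) + batchnorm_params(C)
pointwise = conv2d_params(C, C, kernel=1) + batchnorm_params(C)
block = depthwise + pointwise
head = linear_params(C, NUM_CLASSES)

total = stem + 2 * block + head
size_mb = total * 4 / 1024 ** 2  # 4 bytes per float32 parameter

print(total)              # 17276
print(round(size_mb, 2))  # 0.07
```

Under these assumptions the total matches the 17,276 parameters reported above, and 17,276 float32 values occupy roughly 0.07 MB.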

Script Description

MicroOCR
├── README.md                                   # Project description
├── collatefn.py                                # Collate function for batching samples
├── ctc_label_converter.py                      # Converts between text and CTC label indices
├── dataset.py                                  # Data preprocessing for training and evaluation
├── demo.py                                     # Inference demo
├── gen_image.py                                # Generates images for training and evaluation
├── infer_tool.py                               # Inference tool
├── keys.py                                     # Character set
├── loss.py                                     # CTC loss definition
├── metric.py                                   # Accuracy metric for MicroNet
├── model.py                                    # MicroNet model definition
└── train.py                                    # Trains the model
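
For readers unfamiliar with CTC, the role of a label converter can be illustrated with a minimal greedy decoder. This is a sketch and not the code in ctc_label_converter.py: the class name `SimpleCTCConverter`, the use of index 0 as the CTC blank, and the toy character set are all assumptions made for illustration.

```python
class SimpleCTCConverter:
    """Hypothetical converter; index 0 is reserved for the CTC blank."""

    def __init__(self, characters):
        self.blank = 0
        # Position 0 is the blank, so real characters start at index 1.
        self.idx_to_char = [""] + list(characters)
        self.char_to_idx = {c: i + 1 for i, c in enumerate(characters)}

    def encode(self, text):
        """Map a string to the label indices used as the CTC target."""
        return [self.char_to_idx[c] for c in text]

    def decode(self, indices):
        """Greedy CTC decoding: merge consecutive repeats, then drop blanks."""
        chars = []
        previous = self.blank
        for idx in indices:
            if idx != previous and idx != self.blank:
                chars.append(self.idx_to_char[idx])
            previous = idx
        return "".join(chars)


converter = SimpleCTCConverter("abc")
print(converter.encode("cab"))                         # [3, 1, 2]
print(converter.decode([0, 1, 1, 0, 1, 2, 2, 0, 3]))   # aabc
```

In a setup like the one in the model summary, the indices passed to `decode` would be the per-step argmax over the 60 outputs at each of the 30 time steps. The blank between the two runs of `1` is what lets the decoder keep a genuinely doubled character.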

Generate data for training and evaluation

python gen_image.py

Training

python train.py

Inference

python demo.py

Owner

william
