BankNote-Net: Open dataset and encoder model for assistive currency recognition

Overview

BankNote-Net: Open Dataset for Assistive Currency Recognition

Millions of people around the world have low or no vision. Assistive software applications have been developed for a variety of day-to-day tasks, including currency recognition. To aid with this task, we present BankNote-Net, an open dataset for assistive currency recognition. The dataset consists of a total of 24,816 embeddings of banknote images captured in a variety of assistive scenarios, spanning 17 currencies and 112 denominations. These compliant embeddings were learned using supervised contrastive learning and a MobileNetV2 architecture, and they can be used to train and test specialized downstream models for any currency, including those not covered by our dataset or for which only a few real images per denomination are available (few-shot learning). We deploy a variation of this model for public use in the last version of the Seeing AI app developed by Microsoft, which has over a 100 thousand monthly active users.

If you make use of this dataset or pre-trained model in your own project, please consider referencing this GitHub repository and citing our paper:

@article{oviedoBankNote-Net2022,
  title   = {BankNote-Net: Open Dataset for Assistive Currency Recognition},
  author  = {Felipe Oviedo, Srinivas Vinnakota, Eugene Seleznev, Hemant Malhotra, Saqib Shaikh & Juan Lavista Ferres},
  journal = {https://arxiv.org/pdf/2204.03738.pdf},
  year    = {2022},
}

Data Structure

The dataset data structure consists of 256-dimensional vector embeddings with additional columns for currency, denomination and face labels, as explained in the data exploration notebook. The dataset is saved as 24,826 x 258 flat table in feather and csv file formats. Figure 1 presents some of these learned embeddings.

Figure 1: t-SNE representations of the BankNote-Net embeddings for a few selected currencies.

Setup and Dataset Usage

  1. Install requirements.

    Please, use the conda environment file env.yaml to install the right dependencies.

    # Create conda environment
    conda create env -f env.yaml
    
    # Activate environment to run examples
    conda activate banknote_net
    
  2. Example 1: Train a shallow classifier directly from the dataset embeddings for a currency available in the dataset. For inference, images should be encoded first using the keras MobileNet V2 pre-trained encoder model.

    Run the following file from root: train_from_embedding.py

    python src/train_from_embedding.py --currency AUD --bsize 128 --epochs 25 --dpath ./data/banknote_net.feather
    
      usage: train_from_embedding.py [-h] --currency
                                  {AUD,BRL,CAD,EUR,GBP,INR,JPY,MXN,PKR,SGD,TRY,USD,NZD,NNR,MYR,IDR,PHP}
                                  [--bsize BSIZE] [--epochs EPOCHS]
                                  [--dpath DPATH]
    
      Train model from embeddings.
    
      optional arguments:
      -h, --help            show this help message and exit
      --currency {AUD,BRL,CAD,EUR,GBP,INR,JPY,MXN,PKR,SGD,TRY,USD,NZD,NNR,MYR,IDR,PHP}, --c {AUD,BRL,CAD,EUR,GBP,INR,JPY,MXN,PKR,SGD,TRY,USD,NZD,NNR,MYR,IDR,PHP}
                              String of currency for which to train shallow
                              classifier
      --bsize BSIZE, --b BSIZE
                              Batch size for shallow classifier
      --epochs EPOCHS, --e EPOCHS
                              Number of epochs for training shallow top classifier
      --dpath DPATH, --d DPATH
                              Path to .feather BankNote Net embeddings
                          
    
  3. Example 2: Train a classifier on top of the BankNote-Net pre-trained encoder model using images in a custom directory. Input images must be of size 224 x 224 pixels and have square aspect ratio. For this example, we use a couple dozen images spanning 8 classes for Swedish Krona, structured as in the example_images/SEK directory, that contains both training and validation images.

    Run the following file from root: train_custom.py

    python src/train_custom.py --bsize 4 --epochs 25 --data_path ./data/example_images/SEK/ --enc_path ./models/banknote_net_encoder.h5
    
    usage: train_custom.py [-h] [--bsize BSIZE] [--epochs EPOCHS]
                      [--data_path DATA_PATH] [--enc_path ENC_PATH]
    
    Train model from custom image folder using pre-trained BankNote-Net encoder.
    
    optional arguments:
    -h, --help            show this help message and exit
    --bsize BSIZE, --b BSIZE
                          Batch size
    --epochs EPOCHS, --e EPOCHS
                          Number of epochs for training shallow top classifier.
    --data_path DATA_PATH, --data DATA_PATH
                          Path to folder with images.
    --enc_path ENC_PATH, --enc ENC_PATH
                          Path to .h5 file of pre-trained encoder model.                       
    
  4. Example 3: Perform inference using the SEK few-shot classifier of Example 2, and the validation images on example_images/SEK/val

    Run the following file from root: predict_custom.py, returns encoded predictions.

      python src/predict_custom.py --bsize 1 --data_path ./data/example_images/SEK/val/ --model_path ./src/trained_models/custom_classifier.h5
    
      usage: predict_custom.py [-h] [--bsize BSIZE] [--data_path DATA_PATH]
                              [--model_path MODEL_PATH]
    
      Perform inference using trained custom classifier.
    
      optional arguments:
      -h, --help            show this help message and exit
      --bsize BSIZE, --b BSIZE
                              Batch size
      --data_path DATA_PATH, --data DATA_PATH
                              Path to custom folder with validation images.
      --model_path MODEL_PATH, --enc MODEL_PATH
                              Path to .h5 file of trained classification model.                           
    

License for Dataset and Model

Copyright (c) Microsoft Corporation. All rights reserved.

The dataset is open for anyone to use under the CDLA-Permissive-2.0 license. The embeddings should not be used to reconstruct high resolution banknote images.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.

Owner
Microsoft
Open source projects and samples from Microsoft
Microsoft
Pytorch implementation of CoCon: A Self-Supervised Approach for Controlled Text Generation

COCON_ICLR2021 This is our Pytorch implementation of COCON. CoCon: A Self-Supervised Approach for Controlled Text Generation (ICLR 2021) Alvin Chan, Y

alvinchangw 79 Dec 18, 2022
Mengzi Pretrained Models

中文 | English Mengzi 尽管预训练语言模型在 NLP 的各个领域里得到了广泛的应用,但是其高昂的时间和算力成本依然是一个亟需解决的问题。这要求我们在一定的算力约束下,研发出各项指标更优的模型。 我们的目标不是追求更大的模型规模,而是轻量级但更强大,同时对部署和工业落地更友好的模型。

Langboat 424 Jan 04, 2023
Official implementation for NIPS'17 paper: PredRNN: Recurrent Neural Networks for Predictive Learning Using Spatiotemporal LSTMs.

PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning The predictive learning of spatiotemporal sequences aims to generate future

THUML: Machine Learning Group @ THSS 243 Dec 26, 2022
TorchX is a library containing standard DSLs for authoring and running PyTorch related components for an E2E production ML pipeline.

TorchX is a library containing standard DSLs for authoring and running PyTorch related components for an E2E production ML pipeline

193 Dec 22, 2022
Devkit for 3D -- Some utils for 3D object detection based on Numpy and Pytorch

D3D Devkit for 3D: Some utils for 3D object detection and tracking based on Numpy and Pytorch Please consider siting my work if you find this library

Jacob Zhong 27 Jul 07, 2022
Learning with Subset Stacking

Learning with Subset Stacking (LESS) LESS is a new supervised learning algorithm that is based on training many local estimators on subsets of a given

S. Ilker Birbil 19 Oct 04, 2022
GeneDisco is a benchmark suite for evaluating active learning algorithms for experimental design in drug discovery.

GeneDisco is a benchmark suite for evaluating active learning algorithms for experimental design in drug discovery.

22 Dec 12, 2022
Framework for estimating the structures and parameters of Bayesian networks (DAGs) at per-sample resolution

Sample-specific Bayesian Networks A framework for estimating the structures and parameters of Bayesian networks (DAGs) at per-sample or per-patient re

Caleb Ellington 1 Sep 23, 2022
Generative Adversarial Text-to-Image Synthesis

###Generative Adversarial Text-to-Image Synthesis Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, Honglak Lee This is the

Scott Ellison Reed 883 Dec 31, 2022
領域を指定し、キーを入力することで画像を保存するツールです。クラス分類用のデータセット作成を想定しています。

image-capture-class-annotation 領域を指定し、キーを入力することで画像を保存するツールです。 クラス分類用のデータセット作成を想定しています。 Requirement OpenCV 3.4.2 or later Usage 実行方法は以下です。 起動後はマウスクリック4

KazuhitoTakahashi 5 May 28, 2021
Official implementation of UTNet: A Hybrid Transformer Architecture for Medical Image Segmentation

UTNet (Accepted at MICCAI 2021) Official implementation of UTNet: A Hybrid Transformer Architecture for Medical Image Segmentation Introduction Transf

110 Jan 01, 2023
The codes for the work "Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation"

Swin-Unet The codes for the work "Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation"(https://arxiv.org/abs/2105.05537). A validatio

869 Jan 07, 2023
This code implements constituency parse tree aggregation

README This code implements constituency parse tree aggregation. Folder details code: This folder contains the code that implements constituency parse

Adithya Kulkarni 0 Oct 11, 2021
This is the code repository for the paper "Identification of the Generalized Condorcet Winner in Multi-dueling Bandits" (NeurIPS 2021).

Code Repository for the Paper "Identification of the Generalized Condorcet Winner in Multi-dueling Bandits" (To appear in: Proceedings of NeurIPS20

1 Oct 03, 2022
Code for the paper "Adversarial Generator-Encoder Networks"

This repository contains code for the paper "Adversarial Generator-Encoder Networks" (AAAI'18) by Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky. Pr

Dmitry Ulyanov 279 Jun 26, 2022
Proposed n-stage Latent Dirichlet Allocation method - A Novel Approach for LDA

n-stage Latent Dirichlet Allocation (n-LDA) Proposed n-LDA & A Novel Approach for classical LDA Latent Dirichlet Allocation (LDA) is a generative prob

Anıl Güven 4 Mar 07, 2022
Malware Bypass Research using Reinforcement Learning

Malware Bypass Research using Reinforcement Learning

Bobby Filar 76 Dec 26, 2022
Reinforcement Learning via Supervised Learning

Reinforcement Learning via Supervised Learning Installation Run pip install -e . in an environment with Python = 3.7.0, 3.9. The code depends on MuJ

Scott Emmons 49 Nov 28, 2022
StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation

StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation Demo video: CVPR 2021 Oral: Single Channel Manipulation: Localized or attribu

Zongze Wu 267 Dec 30, 2022
State of the art Semantic Sentence Embeddings

Contrastive Tension State of the art Semantic Sentence Embeddings Published Paper · Huggingface Models · Report Bug Overview This is the official code

Fredrik Carlsson 88 Dec 30, 2022