Official implementation for ICDAR 2021 paper "Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer"

Overview

Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer

arXiv

Description

Convert offline handwritten mathematical expression to LaTeX sequence using bidirectionally trained transformer.

How to run

First, install dependencies

# clone project   
git clone https://github.com/Green-Wood/BTTR

# install project   
cd BTTR
conda create -y -n bttr python=3.7
conda activate bttr
conda install --yes -c pytorch pytorch=1.7.0 torchvision cudatoolkit=<your-cuda-version>
pip install -e .   

Next, navigate to any file and run it. It may take 6~7 hours to coverage on 4 gpus using ddp.

# module folder
cd BTTR

# train bttr model using 4 gpus and ddp
python train.py --config config.yaml  

For single gpu user, you may change the config.yaml file to

gpus: 1
# gpus: 4
# accelerator: ddp

Imports

This project is setup as a package which means you can now easily import any file into any other file like so:

from bttr.datamodule import CROHMEDatamodule
from bttr import LitBTTR
from pytorch_lightning import Trainer

# model
model = LitBTTR()

# data
dm = CROHMEDatamodule(test_year=test_year)

# train
trainer = Trainer()
trainer.fit(model, datamodule=dm)

# test using the best model!
trainer.test(datamodule=dm)

Note

Metrics used in validation is not accurate.

For more accurate metrics:

  1. use test.py to generate result.zip
  2. download and install crohmelib, lgeval, and tex2symlg tool.
  3. convert tex file to symLg file using tex2symlg command
  4. evaluate two folder using evaluate command

Citation

@article{zhao2021handwritten,
  title={Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer},
  author={Zhao, Wenqi and Gao, Liangcai and Yan, Zuoyu and Peng, Shuai and Du, Lin and Zhang, Ziyin},
  journal={arXiv preprint arXiv:2105.02412},
  year={2021}
}
Comments
  • can you provide predict.py code?

    can you provide predict.py code?

    Hi ~ @Green-Wood.

    I feel grateful mind for your help. I wanna get predict.py code that prints latex from an input image. If this code is provided, it will be very useful to others as well.

    Best regards.

    opened by ai-motive 17
  • val_exprate=0 and save checkpoint

    val_exprate=0 and save checkpoint

    hello!thanks for your time! When I transfer some code in decoder or use it directly,the val_exprate are always be 0.000,I don't know why. Another problem is,I noticed that this code don't have the function to save checkpoint or something.Can you give me some help?Thanks again!

    opened by Ashleyyyi 6
  • Val_exprate = 0

    Val_exprate = 0

    When I retrained the model according to the instruction, the val_exprate was always 0.00, did anyone encounter this problem, thank you! (I has not modified any codes) @Green-Wood

    opened by qingqianshuying 4
  • test.py error occurs

    test.py error occurs

    When I run test.py code, the following error occurs. Can i get some helps?

    in test.py code test_year = "2016" ckp_path = "pretrained model"

    GPU available: True, used: True
    TPU available: False, using: 0 TPU cores
    Load data from: /home/motive/PycharmProjects/BTTR/bttr/datamodule/../../data.zip
    Extract data from: 2016, with data size: 1147
    total  1147 batch data loaded
    LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
    Testing: 100%|██████████| 1147/1147 [07:34<00:00,  2.01s/it]ExpRate: 0.32258063554763794
    length of total file: 1147
    Testing: 100%|██████████| 1147/1147 [07:34<00:00,  2.52it/s]
    --------------------------------------------------------------------------------
    DATALOADER:0 TEST RESULTS
    {}
    --------------------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/motive/PycharmProjects/BTTR/test.py", line 17, in <module>
        trainer.test(model, datamodule=dm)
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 579, in test
        results = self._run(model)
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 759, in _run
        self.post_dispatch()
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 789, in post_dispatch
        self.accelerator.teardown()
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/accelerators/gpu.py", line 51, in teardown
        self.lightning_module.cpu()
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/pytorch_lightning/utilities/device_dtype_mixin.py", line 141, in cpu
        return super().cpu()
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 471, in cpu
        return self._apply(lambda t: t.cpu())
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 359, in _apply
        module._apply(fn)
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torchmetrics/metric.py", line 317, in _apply
        setattr(this, key, [fn(cur_v) for cur_v in current_val])
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torchmetrics/metric.py", line 317, in <listcomp>
        setattr(this, key, [fn(cur_v) for cur_v in current_val])
      File "/home/motive/anaconda3/envs/bttr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 471, in <lambda>
        return self._apply(lambda t: t.cpu())
    AttributeError: 'tuple' object has no attribute 'cpu'
    
    opened by ai-motive 3
  • How long does BTTR take to train?

    How long does BTTR take to train?

    Hi, thank you for great repository!

    How long does it take to train for your experiment in the paper? I mean training on CROHME 2014/2016/2019 on four NVIDIA 1080Ti GPUs.

    Thanks,

    opened by RyosukeFukatani 2
  • can you provide transfer learning code?

    can you provide transfer learning code?

    Hi~ @Green-Wood

    I wanna apply trasnfer learning using pretrained model.

    but, LightningCLI() is wrapped and difficult to customize.

    Thanks & best regards.

    opened by ai-motive 1
  • How can it get pretrained model ?

    How can it get pretrained model ?

    Hi, I wanna test your BTTR model but, it need to training process which will take a lot of time. So, can you give me a pretrained model link?

    Best regards.

    opened by ai-motive 1
  • After adding new token in dictionary getting error .

    After adding new token in dictionary getting error .

    Hi , getting error after adding new token in dictionary.txt

    Error(s) in loading state_dict for LitBTTR: size mismatch for bttr.decoder.word_embed.0.weight: copying a param with shape torch.Size([113, 256]) from checkpoint, the shape in current model is torch.Size([115, 256]). size mismatch for bttr.decoder.proj.weight: copying a param with shape torch.Size([113, 256]) from checkpoint, the shape in current model is torch.Size([115, 256]). size mismatch for bttr.decoder.proj.bias: copying a param with shape torch.Size([113]) from checkpoint, the shape in current model is torch.Size([115]).

    Kindly help me out how can i fix this error.

    opened by shivankaraditi 0
  • About dataset

    About dataset

    Could you tell me how to generate the offline math expression image from inkml file? My experiment show that a large scale image could improve the result obviously,so I'd like to know if there is unified offline data for academic research.

    opened by lightflash7 0
  • predicting on gpu is slower

    predicting on gpu is slower

    Hi ,

    As this model is a bit slower compared to the existing state-of-the-art model on CPU. So I tried to make predictions on GPU and surprisingly it slower on Gpu compare to CPU as well.

    I am attaching a code snapshot here

    device = torch.device('cuda')if torch.cuda.is_available() else torch.device('cpu')

    model = LitBTTR.load_from_checkpoint('pretrained-2014.ckpt',map_location=device)

    img = Image.open(img_path) img = ToTensor()(img) img.to(device)

    t1 = time.time() hyp = model.beam_search(img) t2 = time.time()

    Kindly help me out here how i can reduce prediction time

    FYI - using GPU on aws g4dn.xlarge configuration machine

    opened by Suma3 1
  • how to use TensorBoard?

    how to use TensorBoard?

    hello i don't know how to add scalar to TensorBoard? I want to do this kind of topic, hoping to improve some ExpRate, but I don’t know much about lightning TensorBoard.

    opened by win5923 9
Releases(v2.0)
Owner
Wenqi Zhao
Student in Nanjing University
Wenqi Zhao
Learnable Multi-level Frequency Decomposition and Hierarchical Attention Mechanism for Generalized Face Presentation Attack Detection

LMFD-PAD Note This is the official repository of the paper: LMFD-PAD: Learnable Multi-level Frequency Decomposition and Hierarchical Attention Mechani

28 Dec 02, 2022
A multi-functional library for full-stack Deep Learning. Simplifies Model Building, API development, and Model Deployment.

chitra What is chitra? chitra (चित्र) is a multi-functional library for full-stack Deep Learning. It simplifies Model Building, API development, and M

Aniket Maurya 210 Dec 21, 2022
Reliable probability face embeddings

ProbFace, arxiv This is a demo code of training and testing [ProbFace] using Tensorflow. ProbFace is a reliable Probabilistic Face Embeddging (PFE) me

Kaen Chan 34 Dec 31, 2022
GNPy: Optical Route Planning and DWDM Network Optimization

GNPy is an open-source, community-developed library for building route planning and optimization tools in real-world mesh optical networks

Telecom Infra Project 140 Dec 19, 2022
Implementation of OmniNet, Omnidirectional Representations from Transformers, in Pytorch

Omninet - Pytorch Implementation of OmniNet, Omnidirectional Representations from Transformers, in Pytorch. The authors propose that we should be atte

Phil Wang 48 Nov 21, 2022
Nodule Generation Algorithm Baseline and template code for node21 generation track

Nodule Generation Algorithm This codebase implements a simple baseline model, by following the main steps in the paper published by Litjens et al. for

node21challenge 10 Apr 21, 2022
Face recognize and crop them

Face Recognize Cropping Module Source 아이디어 Face Alignment with OpenCV and Python Requirement 필요 라이브러리 imutil dlib python-opence (cv2) Usage 사용 방법 open

Cho Moon Gi 1 Feb 15, 2022
Pytorch implementation of Nueral Style transfer

Nueral Style Transfer Pytorch implementation of Nueral style transfer algorithm , it is used to apply artistic styles to content images . Content is t

Abhinav 9 Oct 15, 2022
Implicit Model Specialization through DAG-based Decentralized Federated Learning

Federated Learning DAG Experiments This repository contains software artifacts to reproduce the experiments presented in the Middleware '21 paper "Imp

Operating Systems and Middleware Group 5 Oct 16, 2022
Symmetry and Uncertainty-Aware Object SLAM for 6DoF Object Pose Estimation

SUO-SLAM This repository hosts the code for our CVPR 2022 paper "Symmetry and Uncertainty-Aware Object SLAM for 6DoF Object Pose Estimation". ArXiv li

Robot Perception & Navigation Group (RPNG) 97 Jan 03, 2023
[CVPR 2021] Monocular depth estimation using wavelets for efficiency

Single Image Depth Prediction with Wavelet Decomposition Michaël Ramamonjisoa, Michael Firman, Jamie Watson, Vincent Lepetit and Daniyar Turmukhambeto

Niantic Labs 205 Jan 02, 2023
PyTorch implementation of CloudWalk's recent work DenseBody

densebody_pytorch PyTorch implementation of CloudWalk's recent paper DenseBody. Note: For most recent updates, please check out the dev branch. Update

Lingbo Yang 401 Nov 19, 2022
A simple code to perform canny edge contrast detection on images.

CECED-Canny-Edge-Contrast-Enhanced-Detection A simple code to perform canny edge contrast detection on images. A simple code to process images using c

Happy N. Monday 3 Feb 15, 2022
Anti-UAV base on PaddleDetection

Paddle-Anti-UAV Anti-UAV base on PaddleDetection Background UAVs are very popular and we can see them in many public spaces, such as parks and playgro

Qingzhong Wang 2 Apr 20, 2022
Use tensorflow to implement a Deep Neural Network for real time lane detection

LaneNet-Lane-Detection Use tensorflow to implement a Deep Neural Network for real time lane detection mainly based on the IEEE IV conference paper "To

MaybeShewill-CV 1.9k Jan 08, 2023
Campsite Reservation Finder

yellowstone-camping UPDATE: yellowstone-camping is being expanded and renamed to camply. The updated tool now interfaces with the Recreation.gov API a

Justin Flannery 233 Jan 08, 2023
Classify the disease status of a plant given an image of a passion fruit

Passion Fruit Disease Detection I tried to create an accurate machine learning models capable of localizing and identifying multiple Passion Fruits in

3 Nov 09, 2021
Few-shot Relation Extraction via Bayesian Meta-learning on Relation Graphs

Few-shot Relation Extraction via Bayesian Meta-learning on Relation Graphs This is an implemetation of the paper Few-shot Relation Extraction via Baye

MilaGraph 36 Nov 22, 2022
Tesla Light Show xLights Guide With python

Tesla Light Show xLights Guide Welcome to the Tesla Light Show xLights guide! You can create and run your own light shows on Tesla vehicles. Running a

Tesla, Inc. 2.5k Dec 29, 2022
Si Adek Keras is software VR dangerous object detection.

Si Adek Python Keras Sistem Informasi Deteksi Benda Berbahaya Keras Python. Version 1.0 Developed by Ananda Rauf Maududi. Developed date: 24 November

Ananda Rauf 1 Dec 21, 2021