An application that maps an image of a LaTeX math equation to LaTeX code.

Overview

Image to LaTeX

Code style: black pre-commit License

An application that maps an image of a LaTeX math equation to LaTeX code.

Image to Latex streamlit app

Introduction

The problem of image-to-markup generation has been attempted by Deng et al. (2016). They provide the raw and preprocessed versions of im2latex-100K, a dataset consisting of about 100K LaTeX math equation images. Using their dataset, I trained a model that uses ResNet-18 as encoder (up to layer3) and a Transformer as decoder with cross-entropy loss.

Initially, I used the preprocessed dataset to train my model, but the preprocessing turned out to be a huge limitation. Although the model can achieve a reasonable performance on the test set, it performs poorly if the image quality, padding, or font size is different from the images in the dataset. This phenomenon has also been observed by others who have attempted the same problem using the same dataset (e.g., this project, this issue and this issue). This is most likely due to the rigid preprocessing for the dataset (e.g. heavy downsampling).

To this end, I used the raw dataset and included image augmentation (e.g. random scaling, small rotation) in my data processing pipeline to increase the diversity of the samples. Moreover, unlike Deng et al. (2016), I did not group images by size. Rather, I sampled them uniformly and padded them to the size of the largest image in the batch, to increase the generalizability of the model.

Additional problems that I found in the dataset:

  • Some latex code produces visually identical outputs (e.g. \left( and \right) look the same as ( and )), so I normalized them.
  • Some latex code is used to add space (e.g. \vspace{2px} and \hspace{0.3mm}). However, the length of the space is diffcult to judge. Also, I don't want the model generates code on blank images, so I removed them.

The best run has a character error rate (CER) of 0.17 in test set. Most errors seem to come from unnecessary horizontal spacing, e.g., \;, \, and \qquad. (I only removed \vspace and \hspace during preprocessing. I did not know that LaTeX has so many horizontal spacing commands.)

Possible improvements include:

  • Do a better job cleaning the data (e.g., removing spacing commands)
  • Train the model for more epochs (for the sake of time, I only trained the model for 15 epochs, but the validation loss is still going down)
  • Use beam search (I only implemented greedy search)
  • Use a larger model (e.g., use ResNet-34 instead of ResNet-18)
  • Do some hyperparameter tuning

I didn't do any of these, because I had limited computational resources (I was using Google Colab).

How To Use

Setup

Clone the repository to your computer and position your command line inside the repository folder:

git clone https://github.com/kingyiusuen/image-to-latex.git
cd image-to-latex

Then, create a virtual environment named venv and install required packages:

make venv
make install-dev

Data Preprocessing

Run the following command to download the im2latex-100k dataset and do all the preprocessing. (The image cropping step may take over an hour.)

python scripts/prepare_data.py

Model Training and Experiment Tracking

Model Training

An example command to start a training session:

python scripts/run_experiment.py trainer.gpus=1 data.batch_size=32

Configurations can be modified in conf/config.yaml or in command line. See Hydra's documentation to learn more.

Experiment Tracking using Weights & Biases

The best model checkpoint will be uploaded to Weights & Biases (W&B) automatically (you will be asked to register or login to W&B before the training starts). Here is an example command to download a trained model checkpoint from W&B:

python scripts/download_checkpoint.py RUN_PATH

Replace RUN_PATH with the path of your run. The run path should be in the format of //. To find the run path for a particular experiment run, go to the Overview tab in the dashboard.

For example, you can use the following command to download my best run

python scripts/download_checkpoint.py kingyiusuen/image-to-latex/1w1abmg1

The checkpoint will be downloaded to a folder named artifacts under the project directory.

Testing and Continuous Integration

The following tools are used to lint the codebase:

isort: Sorts and formats import statements in Python scripts.

black: A code formatter that adheres to PEP8.

flake8: A code linter that reports stylistic problems in Python scripts.

mypy: Performs static type checking in Python scripts.

Use the following command to run all the checkers and formatters:

make lint

See pyproject.toml and setup.cfg at the root directory for their configurations.

Similar checks are done automatically by the pre-commit framework when a commit is made. Check out .pre-commit-config.yaml for the configurations.

Deployment

An API is created to make predictions using the trained model. Use the following command to get the server up and running:

make api

You can explore the API via the generated documentation at http://0.0.0.0:8000/docs.

To run the Streamlit app, create a new terminal window and use the following command:

make streamlit

The app should be opened in your browser automatically. You can also open it by visiting http://localhost:8501. For the app to work, you need to download the artifacts of an experiment run (see above) and have the API up and running.

To create a Docker image for the API:

make docker

Acknowledgement

Python implementation of image filters (such as brightness, contrast, saturation, etc.)

PyPhotoshop Python implementation of image filters Use Python to adjust brightness and contrast, add blur, and detect edges! Follow along tutorial: ht

Kylie 87 Dec 15, 2022
Pure Python bindings for the pure C++11/OpenCL Qrack quantum computer simulator library

pyqrack Pure Python bindings for the pure C++11/OpenCL Qrack quantum computer simulator library (PyQrack is just pure Qrack.) IMPORTANT: You must buil

vm6502q 6 Jul 21, 2022
Random collage/montage generator with drop-shadow

Random Collage Example Usage These are the sample input files in $PWD for the below examples: 1.png 2.png 3.png 4.png 5.png 6.png 7.png 8.png 9.png 10

M B 1 Dec 07, 2021
PyGram Instagram-like image filters.

PyGram Instagram-like image filters. Usage First, import the client: from filters import * Instanciate a filter and apply it: f = Nashville("image.jp

Ajay Kumar Nagaraj 102 Feb 21, 2022
reversable image censoring tool

StupidCensor a REVERSABLE image censoring tool to reversably mosiac censor jpeg files to temporarily remove image details not allowed on most websites

2 Jan 28, 2022
NFT collection generator. Generates layered images

NFT collection generator Generates layered images, whole collections. Provides additional functionality. Repository includes three scripts generate.py

Gleb Gonchar 10 Nov 15, 2022
Alternate Python bindings for the Open Asset Import Library (ASSIMP)

Impasse A simple Python wrapper for assimp using cffi to access the library. Requires Python = 3.7. It's a fork of PyAssimp, Assimp's official Python

Salad Dais 3 Sep 26, 2022
API to help generating QR-code for ZATCA's e-invoice known as Fatoora with any programming language

You can try it @ api-fatoora api-fatoora API to help generating QR-code for ZATCA's e-invoice known as Fatoora with any programming language Disclaime

نافع الهلالي 12 Oct 05, 2022
Python class that generates pixel art from images

Python class that generates pixel art from images

Richard Nagyfi 1.4k Dec 29, 2022
Viewer for NFO files

NFO Viewer NFO Viewer is a simple viewer for NFO files, which are "ASCII" art in the CP437 codepage. The advantages of using NFO Viewer instead of a t

Osmo Salomaa 114 Dec 29, 2022
PSD (Photoshop, Krita, Gimp...) -> Godot.

limage v0.2.2 Features Getting Started Tags Settings Todo Customizer Changes Solutions WARNING: Requires Python to be installed PSD (Photoshop, Krita,

21 Nov 10, 2022
Imutils - A series of convenience functions to make basic image processing operations such as translation, rotation, resizing, skeletonization, and displaying Matplotlib images easier with OpenCV and Python.

imutils A series of convenience functions to make basic image processing functions such as translation, rotation, resizing, skeletonization, and displ

PyImageSearch 4.3k Jan 01, 2023
Transfers a image file(.png) to an Excel file(.xlsx)

Transfers a image file(.png) to an Excel file(.xlsx)

Junu Kwon 7 Feb 11, 2022
Bringing vtk.js into Dash and Python

Dash VTK Dash VTK lets you integrate the vtk.js visualization pipeline directly into your Dash app. It is powered by react-vtk-js. Docs Demo Explorer

Plotly 88 Nov 29, 2022
Image2scan - a python program that can be applied on an image in order to get a scan of it back

image2scan Purpose image2scan is a python program that can be applied on an image in order to get a scan of it back. For this purpose, it searches for

Kushal Shingote 2 Feb 13, 2022
Console images in 48 colors, 216 colors and full rgb

console_images Console images in 48 colors, 216 colors and full rgb Full RGB 216 colors 48 colors If it does not work maybe you should change color_fu

Урядов Алексей 5 Oct 11, 2022
3D Model files and source code for rotating turntable. Raspberry Pi, DC servo and PWM modulator required.

3DSimpleTurntable 3D Model files and source code for rotating turntable. Raspberry Pi, DC servo and PWM modulator required. Preview Construction Print

Thomas Boyle 1 Feb 13, 2022
Image manipulation package used for EpicBot.

Image manipulation package used for EpicBot.

Nirlep_5252_ 7 May 26, 2022
Python-based tools for document analysis and OCR

ocropy OCRopus is a collection of document analysis programs, not a turn-key OCR system. In order to apply it to your documents, you may need to do so

OCRopus 3.2k Jan 04, 2023
A Python3 library to generate dynamic SVGs

The Python library for generating dynamic SVGs using Python3

1 Dec 23, 2021