An application that maps an image of a LaTeX math equation to LaTeX code.

Overview

Image to LaTeX

Code style: black pre-commit License

An application that maps an image of a LaTeX math equation to LaTeX code.

Image to Latex streamlit app

Introduction

The problem of image-to-markup generation has been attempted by Deng et al. (2016). They provide the raw and preprocessed versions of im2latex-100K, a dataset consisting of about 100K LaTeX math equation images. Using their dataset, I trained a model that uses ResNet-18 as encoder (up to layer3) and a Transformer as decoder with cross-entropy loss.

Initially, I used the preprocessed dataset to train my model, but the preprocessing turned out to be a huge limitation. Although the model can achieve a reasonable performance on the test set, it performs poorly if the image quality, padding, or font size is different from the images in the dataset. This phenomenon has also been observed by others who have attempted the same problem using the same dataset (e.g., this project, this issue and this issue). This is most likely due to the rigid preprocessing for the dataset (e.g. heavy downsampling).

To this end, I used the raw dataset and included image augmentation (e.g. random scaling, small rotation) in my data processing pipeline to increase the diversity of the samples. Moreover, unlike Deng et al. (2016), I did not group images by size. Rather, I sampled them uniformly and padded them to the size of the largest image in the batch, to increase the generalizability of the model.

Additional problems that I found in the dataset:

  • Some latex code produces visually identical outputs (e.g. \left( and \right) look the same as ( and )), so I normalized them.
  • Some latex code is used to add space (e.g. \vspace{2px} and \hspace{0.3mm}). However, the length of the space is diffcult to judge. Also, I don't want the model generates code on blank images, so I removed them.

The best run has a character error rate (CER) of 0.17 in test set. Most errors seem to come from unnecessary horizontal spacing, e.g., \;, \, and \qquad. (I only removed \vspace and \hspace during preprocessing. I did not know that LaTeX has so many horizontal spacing commands.)

Possible improvements include:

  • Do a better job cleaning the data (e.g., removing spacing commands)
  • Train the model for more epochs (for the sake of time, I only trained the model for 15 epochs, but the validation loss is still going down)
  • Use beam search (I only implemented greedy search)
  • Use a larger model (e.g., use ResNet-34 instead of ResNet-18)
  • Do some hyperparameter tuning

I didn't do any of these, because I had limited computational resources (I was using Google Colab).

How To Use

Setup

Clone the repository to your computer and position your command line inside the repository folder:

git clone https://github.com/kingyiusuen/image-to-latex.git
cd image-to-latex

Then, create a virtual environment named venv and install required packages:

make venv
make install-dev

Data Preprocessing

Run the following command to download the im2latex-100k dataset and do all the preprocessing. (The image cropping step may take over an hour.)

python scripts/prepare_data.py

Model Training and Experiment Tracking

Model Training

An example command to start a training session:

python scripts/run_experiment.py trainer.gpus=1 data.batch_size=32

Configurations can be modified in conf/config.yaml or in command line. See Hydra's documentation to learn more.

Experiment Tracking using Weights & Biases

The best model checkpoint will be uploaded to Weights & Biases (W&B) automatically (you will be asked to register or login to W&B before the training starts). Here is an example command to download a trained model checkpoint from W&B:

python scripts/download_checkpoint.py RUN_PATH

Replace RUN_PATH with the path of your run. The run path should be in the format of //. To find the run path for a particular experiment run, go to the Overview tab in the dashboard.

For example, you can use the following command to download my best run

python scripts/download_checkpoint.py kingyiusuen/image-to-latex/1w1abmg1

The checkpoint will be downloaded to a folder named artifacts under the project directory.

Testing and Continuous Integration

The following tools are used to lint the codebase:

isort: Sorts and formats import statements in Python scripts.

black: A code formatter that adheres to PEP8.

flake8: A code linter that reports stylistic problems in Python scripts.

mypy: Performs static type checking in Python scripts.

Use the following command to run all the checkers and formatters:

make lint

See pyproject.toml and setup.cfg at the root directory for their configurations.

Similar checks are done automatically by the pre-commit framework when a commit is made. Check out .pre-commit-config.yaml for the configurations.

Deployment

An API is created to make predictions using the trained model. Use the following command to get the server up and running:

make api

You can explore the API via the generated documentation at http://0.0.0.0:8000/docs.

To run the Streamlit app, create a new terminal window and use the following command:

make streamlit

The app should be opened in your browser automatically. You can also open it by visiting http://localhost:8501. For the app to work, you need to download the artifacts of an experiment run (see above) and have the API up and running.

To create a Docker image for the API:

make docker

Acknowledgement

Sample data for the napari image viewer.

napari-demo-data Sample data for the napari image viewer. This napari plugin was generated with Cookiecutter using @napari's cookiecutter-napari-plugi

Genevieve Buckley 1 Nov 08, 2021
Python-fu-cartoonify - GIMP plug-in to turn a photo into a cartoon.

python-fu-cartoonify GIMP plug-in to turn a photo into a cartoon. Preview Installation Copy python-fu-cartoonify.py into the plug-in folder listed und

Pascal Reitermann 6 Aug 05, 2022
Convert any binary data to a PNG image file and vice versa.

What is PngBin? The name PngBin comes from an image format file extension PNG (Portable Network Graphics) and the word Binary. An image produced by Pn

Nathan Young 87 Dec 22, 2022
Convert any image into greyscale ASCII art.

Image-to-ASCII Convert any image into greyscale ASCII art.

Ben Smith 12 Jan 15, 2022
Kimimaro: Skeletonize Densely Labeled Images

Kimimaro: Skeletonize Densely Labeled Images # Produce SWC files from volumetric images. kimimaro forge labels.npy --progress # writes to ./kimimaro_o

92 Dec 17, 2022
A python program to generate ANSI art from images and videos

ANSI Art Generator A python program that creates ASCII art (with true color support if enabled) from images and videos Dependencies The program runs u

Pratyush Kumar 12 Nov 08, 2022
A Robust Avatar Generator with a huge number of templates

CoolAvatars Welcome to this repository of CoolAvatars. Using this project, you can generate cool avatars not only from the samples present in my image

RAVI PRAKASH 5 Oct 12, 2021
New program to export a Blender model to the LBA2 model format.

LBA2 Blender to Model 2 This is a new program to export a Blender model to the LBA2 model format. This is also the first publicly released version of

2 Nov 30, 2022
Simple utility to tinker with OPlus images

OPlus image utilities Prerequisites Linux running kernel 5.4 or up (check with uname -r) Image rebuilding Used to rebuild read-only erofs images into

Wiley Lau 15 Dec 28, 2022
This script is for photographers to do timeslice with one click.

One Click TimeSlice Tool What is this for This is for photographers who want to create TimeSlice pictures without installing PS plugins. Before using

Xi Zhao 13 Sep 23, 2022
Gaphor is the simple modeling tool

Gaphor Gaphor is a UML and SysML modeling application written in Python. It is designed to be easy to use, while still being powerful. Gaphor implemen

Gaphor 1.3k Dec 31, 2022
thumbor is an open-source photo thumbnail service by globo.com

Survey If you use thumbor, please take 1 minute and answer this survey? It's only 2 questions and one is multiple choice!!! thumbor is a smart imaging

Thumbor (by @globocom) 9.3k Dec 31, 2022
An API which would colorize a black and white image

Image Colorization API Machine Learning Model used- https://github.com/richzhang/colorization/tree/caffe Paper - https://arxiv.org/abs/1603.08511 Step

Neelesh Ranjan Jha 4 Nov 23, 2021
Python scripts for semi-automated morphometric analysis of atolls from Landsat satellite Imagery.

AtollGeoMorph Python scripts for semi-automated morphometric analysis of atolls from Landsat satellite Imagery. The python scripts included allow user

1 Dec 16, 2022
Glyph-graph - A simple, yet versatile, package for graphing equations on a 2-dimensional text canvas

Glyth Graph Revision for 0.01 A simple, yet versatile, package for graphing equations on a 2-dimensional text canvas List of contents: Brief Introduct

Ivan 2 Oct 21, 2022
image-processing exercises.

image_processing Assignment 21 Checkered Board Create a chess table using numpy and opencv. view: Color Correction Reverse black and white colors with

Benyamin Zojaji 25 Dec 15, 2022
Console images in 48 colors, 216 colors and full rgb

console_images Console images in 48 colors, 216 colors and full rgb Full RGB 216 colors 48 colors If it does not work maybe you should change color_fu

Урядов Алексей 5 Oct 11, 2022
Generate waves art for an image

waves-art Generate waves art for an image. Requirements: OpenCV Numpy Example Usage python waves_art.py --image_path tests/test1.jpg --patch_size 15 T

Hamza Rawal 18 Apr 04, 2022
Fill holes in binary 2D & 3D images fast.

Fill holes in binary 2D & 3D images fast.

11 Dec 09, 2022
The coolest python qrcode maker for small businesses.

QR.ify The coolest python qrcode maker for small businesses. Author Zach Yusuf Project description Python final project. Built to test python skills P

zachystuff 2 Jan 14, 2022