Connectionist Temporal Classification (CTC) decoding algorithms: best path, beam search, lexicon search, prefix search, and token passing. Implemented in Python.

Last update: Jan 03, 2023

Overview

CTC Decoding Algorithms

Update 2021: installable Python package

Python implementation of some common Connectionist Temporal Classification (CTC) decoding algorithms. A minimalistic language model is provided.

Installation

Go to the root level of the repository
Execute pip install . (the trailing dot is part of the command)
Go to tests/ and execute pytest to check if the installation worked

Usage

Basic usage

Here is a minimalistic executable example:

import numpy as np
from ctc_decoder import best_path, beam_search

mat = np.array([[0.4, 0, 0.6], [0.4, 0, 0.6]])
chars = 'ab'

print(f'Best path: "{best_path(mat, chars)}"')
print(f'Beam search: "{beam_search(mat, chars)}"')

The output mat (a numpy array with softmax already applied) of the CTC-trained neural network is expected to have shape TxC and is passed as the first argument to the decoders. T is the number of time-steps, and C is the number of characters plus one for the CTC-blank, which is the last element. The characters that can be predicted by the neural network are passed as the chars string to the decoder. Decoders return the decoded string.
Running the code outputs:

Best path: ""
Beam search: "a"
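
The two decoders disagree on this input because best path only looks at the single most likely path, while beam search scores a text by summing over all paths that collapse to it. The following is a minimal brute-force sketch of that idea, not the package's implementation: it enumerates every path through the same mat, so it is only feasible for tiny inputs. The helper name collapse is an assumption made for illustration.

```python
import itertools
from collections import defaultdict

import numpy as np

mat = np.array([[0.4, 0, 0.6], [0.4, 0, 0.6]])
chars = 'ab'
blank = len(chars)  # the CTC-blank is the last column


def collapse(path):
    """Apply the CTC collapsing rule: merge repeated labels, then drop blanks."""
    out = []
    prev = None
    for label in path:
        if label != prev and label != blank:
            out.append(chars[label])
        prev = label
    return ''.join(out)


# Sum the probability of every path that collapses to the same text.
text_prob = defaultdict(float)
for path in itertools.product(range(mat.shape[1]), repeat=mat.shape[0]):
    p = float(np.prod([mat[t, c] for t, c in enumerate(path)]))
    text_prob[collapse(path)] += p

# Best path: take the most likely label per time-step, then collapse.
best_single_path = collapse(mat.argmax(axis=1).tolist())

# Most likely text when all paths are taken into account.
best_text = max(text_prob, key=text_prob.get)
```

Here the single best path is blank-blank with probability 0.6 * 0.6 = 0.36, which collapses to the empty string. The text "a" is produced by three paths (a-a, a-blank, blank-a) with a total probability of 0.16 + 0.24 + 0.24 = 0.64, so it is the more likely text overall.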

To see more examples on how to use the decoders, please have a look at the scripts in the tests/ folder.

Language model and BK-tree

Beam search can optionally integrate a character-level language model. Text statistics (bigrams) are used by beam search to improve reading accuracy.

from ctc_decoder import beam_search, LanguageModel

# mat and chars are defined as in the basic usage example above

# create language model instance from a (large) text
lm = LanguageModel('this is some text', chars)

# and use it in the beam search decoder
res = beam_search(mat, chars, lm=lm)
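
To give an idea of what character bigram statistics look like, here is a minimal sketch that estimates the probability of the next character given the previous one from a training text. It is not the package's LanguageModel class: the function name bigram_probs and the add-one smoothing are assumptions made for illustration.

```python
from collections import Counter


def bigram_probs(text, chars):
    """Estimate P(next | prev) from character bigram counts (add-one smoothing)."""
    counts = Counter(zip(text, text[1:]))
    probs = {}
    for prev in chars:
        # total count of bigrams starting with prev, plus one per possible successor
        total = sum(counts[(prev, nxt)] for nxt in chars) + len(chars)
        for nxt in chars:
            probs[(prev, nxt)] = (counts[(prev, nxt)] + 1) / total
    return probs


text = 'this is some text'
chars = ''.join(sorted(set(text)))  # all characters occurring in the text
probs = bigram_probs(text, chars)
```

In this text, "i" is always followed by "s", so probs[('i', 's')] is larger than the probability of any other character following "i". A decoder can multiply such probabilities into the score of a beam whenever it extends the beam by one character.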

The lexicon search decoder computes a first approximation with best path decoding. Then, it uses a BK-tree to retrieve similar words, scores them and finally returns the best scoring word. The BK-tree is created by providing a list of dictionary words. A tolerance parameter defines the maximum edit distance from the query word to the returned dictionary words.

from ctc_decoder import lexicon_search, BKTree

# mat and chars are defined as in the basic usage example above

# create BK-tree from a list of words
bk_tree = BKTree(['words', 'from', 'a', 'dictionary'])

# and use the tree in the lexicon search
res = lexicon_search(mat, chars, bk_tree, tolerance=2)
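
The retrieval step can be sketched as follows. This is a generic BK-tree over the Levenshtein edit distance, written for illustration only: the class SimpleBKTree and the function edit_distance are hypothetical names and are not the package's BKTree class.

```python
def edit_distance(a, b):
    """Levenshtein distance via the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


class SimpleBKTree:
    """Each node is (word, children), children maps edit distance -> child node."""

    def __init__(self, words):
        words = iter(words)
        self.root = (next(words), {})
        for word in words:
            self._add(word)

    def _add(self, word):
        node_word, children = self.root
        while True:
            d = edit_distance(word, node_word)
            if d == 0:
                return  # word is already in the tree
            if d not in children:
                children[d] = (word, {})
                return
            node_word, children = children[d]

    def query(self, word, tolerance):
        """Return all stored words within the given edit distance of word."""
        results = []
        stack = [self.root]
        while stack:
            node_word, children = stack.pop()
            d = edit_distance(word, node_word)
            if d <= tolerance:
                results.append(node_word)
            # Triangle inequality: matches can only be in subtrees whose
            # edge label lies in [d - tolerance, d + tolerance].
            for edge, child in children.items():
                if d - tolerance <= edge <= d + tolerance:
                    stack.append(child)
        return sorted(results)


tree = SimpleBKTree(['words', 'from', 'a', 'dictionary'])
matches = tree.query('wordz', 2)
```

With a tolerance of 2, the query 'wordz' retrieves only 'words' (edit distance 1). A lexicon search decoder would then score each retrieved candidate against the network output and return the best one.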

Usage with deep learning frameworks

Some notes:

- No adapter for TensorFlow or PyTorch is provided
- Apply softmax in the model, before decoding
- Convert the output to a numpy array
- Usually, the output of an RNN layer rnn_output has shape TxBxC, with B the batch dimension
  - Decoders work on single batch elements of shape TxC
  - Therefore, iterate over all batch elements and apply the decoder to each of them separately
  - Example: extract the matrix of batch element 0 with mat = rnn_output[:, 0, :]
- The CTC-blank is expected to be the last element along the character dimension
  - TensorFlow has the CTC-blank as last element, so nothing to do here
  - PyTorch, however, has the CTC-blank as first element by default, so you have to move it to the end, or change the default setting
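
The notes above can be sketched with plain numpy. The random logits array is only a stand-in for a real rnn_output that has the CTC-blank at index 0 (as with the PyTorch default), and the softmax helper is written out here because the sketch assumes the model returned raw scores.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


rng = np.random.default_rng(0)
T, B, C = 5, 3, 4                    # time-steps, batch size, characters + blank
logits = rng.normal(size=(T, B, C))  # stand-in for rnn_output, blank at index 0

# Apply softmax over the character dimension.
probs = softmax(logits, axis=2)

# Move the blank from the first to the last position of the character dimension.
probs_blank_last = np.roll(probs, shift=-1, axis=2)

# One TxC matrix per batch element, ready to be passed to a decoder.
mats = [probs_blank_last[:, b, :] for b in range(B)]
```

Each entry of mats can then be decoded separately, for example with best_path(mats[0], chars).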

List of provided decoders

Recommended decoders:

best_path: best path (or greedy) decoder, the fastest of all algorithms, although other decoders are often more accurate
beam_search: beam search decoder, optionally integrates a character-level language model, can be tuned via the beam width parameter
lexicon_search: lexicon search decoder, returns the best scoring word from a dictionary

Other decoders, which in my experience are not really suited for practical purposes but might be useful for experiments or research:

prefix_search: prefix search decoder
token_passing: token passing algorithm
Best path decoder implementation in OpenCL (see extras/ folder)

This paper gives suggestions on when to use best path decoding, beam search decoding and token passing.

Documentation of test cases and data

Documentation of test cases
Documentation of the data

Owner

Harald Scheidl