A pure PyTorch batched computation implementation of "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition"

Last update: Dec 02, 2022

Overview

torch-cif

A pure PyTorch batched computation implementation of "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition" https://arxiv.org/abs/1905.11235.

Usage

def cif_function(
    input: Tensor,
    alpha: Tensor,
    beta: float = 1.0,
    padding_mask: Optional[Tensor] = None,
    target_lengths: Optional[Tensor] = None,
    max_output_length: Optional[int] = None,
    eps: float = 1e-4,
) -> Tuple[Tensor, Tensor, Tensor]:
    r""" A batched computation implementation of continuous integrate and fire (CIF)
    https://arxiv.org/abs/1905.11235

    Args:
        input (Tensor): (N, S, C) Input features to be integrated.
        alpha (Tensor): (N, S) Weights corresponding to each elements in the
            input. It is expected to be after sigmoid function.
        beta (float): the threshold used for determine firing.
        padding_mask (Tensor, optional): (N, S) A binary mask representing
            padded elements in the input.
        target_lengths (Tensor, optional): (N,) Desired length of the targets
            for each sample in the minibatch.
        max_output_length (int, optional): The maximum valid output length used
            in inference. The alpha is scaled down if the sum exceeds this value.
        eps (float, optional): Epsilon to prevent underflow for divisions.
            Default: 1e-4

    Returns: Tuple (output, feat_lengths, alpha_sum)
        output (Tensor): (N, T, C) The output integrated from the source.
        feat_lengths (Tensor): (N,) The output length for each element in batch.
        alpha_sum (Tensor): (N,) The sum of alpha for each element in batch.
            Can be used to compute the quantity loss.
    """

Note

ℹ️ This is a WIP project. the implementation is still being tested.

This implementation uses cumsum and floor to determine the firing positions, and use scatter to merge the weighted source features.
Run test by python test.py (requires pip install expecttest).
Feel free to contact me if there are bugs in the code.

Reference

CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition

A pure PyTorch batched computation implementation of "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition"

Related tags

Overview

torch-cif

Usage

Note

Reference

Owner

張致強

Official PyTorch code for Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution (MANet, ICCV2021)

Pytorch implementation for "Density-aware Chamfer Distance as a Comprehensive Metric for Point Cloud Completion" (NeurIPS 2021)

The dataset of tweets pulling from Twitters with keyword: Hydroxychloroquine, location: US, Time: 2020

Normalization Calibration (NorCal) for Long-Tailed Object Detection and Instance Segmentation

Anti-Adversarially Manipulated Attributions for Weakly and Semi-Supervised Semantic Segmentation (CVPR 2021)

Implement slightly different caffe-segnet in tensorflow

A PyTorch implementation of SlowFast based on ICCV 2019 paper "SlowFast Networks for Video Recognition"

Code for paper "Learning to Reweight Examples for Robust Deep Learning"

Toolbox of models, callbacks, and datasets for AI/ML researchers.

OMAMO: orthology-based model organism selection

PyTorch common framework to accelerate network implementation, training and validation

meProp: Sparsified Back Propagation for Accelerated Deep Learning (ICML 2017)

[ECCVW2020] Robust Long-Term Object Tracking via Improved Discriminative Model Prediction (RLT-DiMP)

Fast Differentiable Matrix Sqrt Root

Deep ViT Features as Dense Visual Descriptors

Categorizing comments on YouTube into different categories.

Commonsense Ability Tests

Image Processing, Image Smoothing, Edge Detection and Transforms

Python binding for Khiva library.

PyStan, a Python interface to Stan, a platform for statistical modeling. Documentation: https://pystan.readthedocs.io