PyTorch implementation of "A Two-Stage End-to-End System for Speech-in-Noise Hearing Aid Processing"

Last update: Aug 19, 2022

Overview

Implementation of the Sheffield entry for the first Clarity enhancement challenge (CEC1)

This repository contains the PyTorch implementation of "A Two-Stage End-to-End System for Speech-in-Noise Hearing Aid Processing", the Sheffield entry for the first Clarity enhancement challenge (CEC1). The system consists of a Conv-TasNet-based denoising module and a finite-impulse-response (FIR) filter-based amplification module. A differentiable approximation to the Cambridge MSBG model released in CEC1 is used in the loss function.
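As a rough illustration of what an FIR-based amplification stage can look like, the sketch below applies a learnable causal FIR filter to a waveform in PyTorch. The class name `FIRAmplifier`, the tap count, and the identity initialisation are assumptions made for this example and are not taken from the repository; the actual module in recipe_amp_fir may be structured differently.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FIRAmplifier(nn.Module):
    """Learnable causal FIR filter (hypothetical stand-in, not the repository's module)."""

    def __init__(self, num_taps: int = 65):
        super().__init__()
        # Assumption: start from an identity filter (unit impulse at tap 0),
        # so the untrained module passes the signal through unchanged.
        taps = torch.zeros(num_taps)
        taps[0] = 1.0
        self.taps = nn.Parameter(taps)

    def forward(self, wav: torch.Tensor) -> torch.Tensor:
        # wav has shape (batch, samples)
        x = wav.unsqueeze(1)  # (batch, 1, samples)
        # F.conv1d computes cross-correlation, so flip the taps to get a true convolution.
        kernel = self.taps.flip(0).view(1, 1, -1)
        # Left-pad with zeros so the filter is causal and the output keeps the input length.
        x = F.pad(x, (self.taps.numel() - 1, 0))
        return F.conv1d(x, kernel).squeeze(1)


amp = FIRAmplifier(num_taps=5)
wav = torch.randn(2, 100)   # synthetic batch of two short waveforms
out = amp(wav)              # same shape as the input
```

Because the taps are an `nn.Parameter`, gradients from any differentiable loss placed after the filter flow back into the filter coefficients.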

Requirements

To run the training recipe for the amplification module, the MSBG package and PyTorch STOI are required.

Training

The overall system is built in two stages. In the first stage, the Conv-TasNet-based denoising module is trained; the scripts are in recipe_den_convtasnet. In the second stage, the FIR-based amplification module is trained; the scripts are in recipe_amp_fir. The MBSTOI folder contains the MBSTOI implementation from the CEC1 project, along with a DBSTOI implementation.
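The two-stage order can be sketched as follows. This is only a toy outline: the two `nn.Conv1d` layers are hypothetical stand-ins for the Conv-TasNet denoiser and the FIR amplifier, the data are random tensors, `nn.MSELoss` is a placeholder for the real objectives, and freezing the denoiser during the second stage is an assumption rather than something the recipes are known to do.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins: tiny 1-D conv layers, NOT Conv-TasNet or the repository's FIR module.
denoiser = nn.Conv1d(1, 1, kernel_size=9, padding=4)
amplifier = nn.Conv1d(1, 1, kernel_size=9, padding=4, bias=False)

noisy = torch.randn(4, 1, 1600)   # synthetic "noisy" batch
clean = torch.randn(4, 1, 1600)   # synthetic "clean" target
loss_fn = nn.MSELoss()            # placeholder loss, not the objective used in the paper

# Stage 1: train the denoiser on its own.
opt1 = torch.optim.Adam(denoiser.parameters(), lr=1e-3)
for _ in range(5):
    opt1.zero_grad()
    loss_fn(denoiser(noisy), clean).backward()
    opt1.step()

# Stage 2: freeze the denoiser (assumption) and train only the amplifier on its output.
for p in denoiser.parameters():
    p.requires_grad_(False)
frozen = [p.clone() for p in denoiser.parameters()]

opt2 = torch.optim.Adam(amplifier.parameters(), lr=1e-3)
for _ in range(5):
    opt2.zero_grad()
    loss_fn(amplifier(denoiser(noisy)), clean).backward()
    opt2.step()
```

In the real recipes, each stage has its own scripts and the second-stage loss involves the differentiable MSBG approximation, which this sketch does not attempt to reproduce.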

References

[1] Luo Y, Mesgarani N. Conv-TasNet: Surpassing ideal time–frequency magnitude masking for speech separation[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2019, 27(8): 1256-1266.
[2] Andersen A H, de Haan J M, Tan Z H, et al. Refinement and validation of the binaural short time objective intelligibility measure for spatially diverse conditions[J]. Speech Communication, 2018, 102: 1-13.
[3] Taal C H, Hendriks R C, Heusdens R, Jensen J. A short-time objective intelligibility measure for time-frequency weighted noisy speech[C]. ICASSP 2010, Dallas, Texas.

Citation

If you use this work, please cite:

@inproceedings{tutwo,
  title={A Two-Stage End-to-End System for Speech-in-Noise Hearing Aid Processing},
  author={Tu, Zehai and Zhang, Jisi and Ma, Ning and Barker, Jon},
  year={2021},
  booktitle={The Clarity Workshop on Machine Learning Challenges for Hearing Aids (Clarity-2021)},
}
