The implementation of "Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Band Speech Enhancement"

Last update: Dec 02, 2022

Related tags

Deep Learning SF-Net

Overview

SF-Net for fullband SE

This is the repository for the manuscript "Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Band Speech Enhancement", which has been submitted to Interspeech 2022. Some audio samples are provided here, and the code for GCRN-full, DS-Net-full, and CTS-Net-full, along with the network configuration of SF-Net, has been released.

Abstract: Due to the high computational complexity to model more frequency bands, it is still intractable to conduct real-time full-band speech enhancement based on deep neural networks. Recent studies typically utilize the compressed perceptually motivated features with relatively low frequency resolution to filter the full-band spectrum by one-stage networks, leading to limited speech quality improvements. In this paper, we propose a coordinated sub-band fusion network for full-band speech enhancement, which aims to recover the low- (0-8 kHz), middle- (8-16 kHz), and high-band (16-24 kHz) in a step-wise manner. Specifically, a dual-stream network is first pretrained to recover the low-band complex spectrum, and another two sub-networks are designed as the middle- and high-band noise suppressors in the magnitude-only domain. To fully capitalize on the information intercommunication, we employ a sub-band interaction module to provide external knowledge guidance across different frequency bands. Extensive experiments show that the proposed method yields consistent performance advantages over state-of-the-art full-band baselines.
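The three-band partition described in the abstract can be illustrated with a minimal sketch. This is not the released SF-Net code: it only shows how the bins of a one-sided spectrum frame might be divided into the low (0-8 kHz), middle (8-16 kHz), and high (16-24 kHz) bands and then concatenated again. The 48 kHz sampling rate is inferred from the 24 kHz upper band edge, and the FFT size of 960 as well as the names `band_bin_ranges` and `split_bands` are assumptions made for illustration.

```python
def band_bin_ranges(sample_rate, n_fft, edges_hz):
    """Return half-open (start, stop) bin ranges over a one-sided spectrum."""
    n_bins = n_fft // 2 + 1          # bins from DC up to and including Nyquist
    hz_per_bin = sample_rate / n_fft
    bounds = [round(edge / hz_per_bin) for edge in edges_hz]
    bounds[-1] = n_bins              # the last band also takes the Nyquist bin
    return [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]


def split_bands(frame, ranges):
    """Slice one spectrum frame into consecutive sub-bands."""
    return [frame[start:stop] for start, stop in ranges]


SAMPLE_RATE = 48000                  # assumed: implied by the 24 kHz band edge
N_FFT = 960                          # assumed FFT size (20 ms at 48 kHz)
EDGES_HZ = [0, 8000, 16000, 24000]   # band edges stated in the abstract

ranges = band_bin_ranges(SAMPLE_RATE, N_FFT, EDGES_HZ)

# Dummy complex spectrum frame standing in for one STFT frame.
frame = [complex(k, -k) for k in range(N_FFT // 2 + 1)]
low, mid, high = split_bands(frame, ranges)

# In the model, each band would be processed by its own sub-network before
# fusion; here the bands are simply concatenated unchanged.
fused = low + mid + high
```

Under these assumptions each bin spans 50 Hz, so the low and middle bands hold 160 bins each and the high band holds 161 because it includes the Nyquist bin. Concatenating the three slices reproduces the original frame, which is the property any fusion step depends on.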

Demo page of audio samples

System flowchart of SF-Net

Results:

Ablation study

Comparison with SOTA

Visualization of spectrograms

VB dataset

DNS blind set

Owner

Guochen Yu
