SASE : Self-Adaptive noise distribution network for Speech Enhancement with heterogeneous data of Cross-Silo Federated learning

Last update: Nov 20, 2021

Overview

We propose SASE, a speech enhancement model with adaptive noise distribution, which achieves state-of-the-art results on the VoiceBank+DEMAND dataset.
We simulate a realistic federated learning setting and verify the robustness of the proposed SASE noise reduction model through experiments and visualization.
The proposed SASE model operates in the complex domain. The TF-GA block extracts richer information about the speech and noise distributions, while SA-GOEA and SA-GUEA adaptively learn the distribution mask of the noise.
We propose a weighting strategy for model aggregation that is better suited to FL-based speech enhancement tasks.
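
The aggregation weighting strategy itself is not spelled out in this overview. As a rough point of reference only, the sketch below shows the generic baseline that such strategies usually modify: a FedAvg-style weighted average of client parameters, with each client weighted by its number of training examples. The function name `weighted_average` and the flat-list parameter layout are assumptions made for illustration, not the SASE implementation.

```python
from typing import List, Sequence


def weighted_average(client_params: Sequence[Sequence[float]],
                     client_weights: Sequence[float]) -> List[float]:
    """Average flat parameter vectors, one weight per client (hypothetical helper)."""
    if not client_params or len(client_params) != len(client_weights):
        raise ValueError("need at least one client and exactly one weight per client")
    total = float(sum(client_weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(client_params[0])
    if any(len(params) != dim for params in client_params):
        raise ValueError("all clients must send vectors of the same length")

    aggregated = [0.0] * dim
    for params, weight in zip(client_params, client_weights):
        scale = weight / total  # normalised contribution of this client
        for i, value in enumerate(params):
            aggregated[i] += scale * value
    return aggregated


if __name__ == "__main__":
    # Two clients with 100 and 300 training examples (made-up numbers).
    print(weighted_average([[1.0, 2.0], [3.0, 4.0]], [100, 300]))  # [2.5, 3.5]
```

A different weighting strategy would only change how `client_weights` is computed; the averaging step stays the same.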

Dependencies

Python >= 3.6 (3.8.5 was used in the experiments)
PyTorch == 1.10.0+cu113
flwr == 2.0.1
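
If it helps, the pinned versions above can be checked against the local environment before training. The sketch below is one possible way to do that. It assumes Python 3.8+ (for `importlib.metadata`) and that PyTorch is installed under the pip distribution name `torch`. `PINNED` and `check_pins` are hypothetical names, not part of the repository.

```python
from importlib import metadata  # available from Python 3.8
from typing import Callable, Dict, Optional, Tuple

# Pins copied from the dependency list above; "torch" is PyTorch's pip name.
PINNED = {"torch": "1.10.0+cu113", "flwr": "2.0.1"}


def check_pins(pins: Dict[str, str],
               get_version: Callable[[str], str] = metadata.version
               ) -> Dict[str, Tuple[Optional[str], bool]]:
    """Return {package: (installed_version_or_None, matches_pin)}."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        report[name] = (installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_pins(PINNED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: wanted {PINNED[name]}, found {installed} [{status}]")
```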

How to run the code

1. Prepare data

VoiceBank+DEMAND: available from the dataset page "Noisy speech database for training speech enhancement algorithms and TTS models". Note that the page marks this dataset as superseded.
CommonVoice (Chinese) + Noise92: Noise92 is available from NOISEX (cmu.edu).

2. Train on the VoiceBank+DEMAND dataset

python main.py

3. Train on the CommonVoice(Chinese)+Noise92 dataset with Federated learning

./run-server.sh
./run-client.sh
- You can change the number of clients by changing NUM_CLIENTS
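
How the scripts use `NUM_CLIENTS` internally is not shown here. As a hedged illustration of what the client count typically controls in a cross-silo simulation with heterogeneous data, the sketch below gives each noise type to exactly one client, so every silo sees a different noise distribution. `partition_by_noise_type`, the sample layout, and the noise type labels are assumptions for illustration only.

```python
from typing import Dict, List, Tuple

NUM_CLIENTS = 3  # assumed value for illustration

Sample = Tuple[str, str]  # (utterance_id, noise_type)


def partition_by_noise_type(samples: List[Sample],
                            num_clients: int) -> Dict[int, List[Sample]]:
    """Assign every noise type to one client (round-robin over sorted types)."""
    if num_clients < 1:
        raise ValueError("num_clients must be at least 1")
    noise_types = sorted({noise for _, noise in samples})
    owner = {noise: i % num_clients for i, noise in enumerate(noise_types)}
    shards: Dict[int, List[Sample]] = {cid: [] for cid in range(num_clients)}
    for sample in samples:
        shards[owner[sample[1]]].append(sample)
    return shards


if __name__ == "__main__":
    data = [("u1", "babble"), ("u2", "white"), ("u3", "pink"), ("u4", "babble")]
    for client_id, shard in partition_by_noise_type(data, NUM_CLIENTS).items():
        print(client_id, shard)
```

With fewer clients than noise types, some clients receive several noise types. With more clients than noise types, some shards stay empty.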

4. Generate wav files and evaluate

python main.py -g --resume "model_file" -df "wavs_root"
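
Which metrics the evaluation step reports is not stated in this section. As a simple sanity check on generated output, the sketch below computes the signal-to-noise ratio in dB between a clean reference and an enhanced signal, both given as equal-length sequences of samples. `snr_db` is a hypothetical helper and not part of the repository. Loading the wav files into sample lists is left out.

```python
import math
from typing import Sequence


def snr_db(clean: Sequence[float], enhanced: Sequence[float]) -> float:
    """SNR in dB, treating (clean - enhanced) as the residual noise."""
    if not clean or len(clean) != len(enhanced):
        raise ValueError("signals must be non-empty and of equal length")
    signal_power = sum(c * c for c in clean)
    if signal_power == 0.0:
        raise ValueError("clean reference is silent; SNR is undefined")
    noise_power = sum((c - e) ** 2 for c, e in zip(clean, enhanced))
    if noise_power == 0.0:
        return math.inf  # enhanced signal matches the reference exactly
    return 10.0 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    clean = [1.0, -1.0, 1.0, -1.0]
    enhanced = [0.9, -0.9, 0.9, -0.9]
    print(round(snr_db(clean, enhanced), 3))  # 20.0
```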

Result

1. Evaluate on VoiceBank+DEMAND dataset

2. Evaluate on CommonVoice+Noise92 dataset

Owner

Tower
