Fast and Simple Neural Vocoder, the Multiband RNNMS

Last update: Jan 11, 2022

Overview

Multiband RNN_MS

Fast and Simple vocoder, Multiband RNN_MS.

Demo
Quick Training
How to Use
System Details
Results
References

Demo

TODO: Link a super great, impressive, high-quality audio demo.

Quick Training

Jump to ☞ , then Run. That's all!

How to Use

1. Install

# pip install "torch==1.10.0" -q      # Based on your environment (validated with v1.10)
# pip install "torchaudio==0.10.0" -q # Based on your environment
pip install git+https://github.com/tarepan/MultibandRNNMS

2. Data & Preprocessing

"Batteries Included".
RNNMS transparently downloads the corpus and preprocesses it for you 😉

3. Train

python -m mbrnnms.main_train

For arguments, check ./mbrnnms/config.py

Advanced: Other datasets

You can switch datasets with arguments.
All of speechcorpusy's preset corpora are supported.

# LJSpeech corpus
python -m mbrnnms.main_train data.data_name=LJ
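
The `data.data_name=LJ` argument above is a dotted-key override: the part before `=` names a nested config field, and the part after is the new value. The repository's actual config loader is not shown here, so the following is only an illustrative sketch of how such overrides map onto a nested config. The function name `apply_overrides`, the example config keys other than `data.data_name`, and the placeholder default value are all hypothetical.

```python
import copy

def apply_overrides(config, overrides):
    """Return a copy of `config` with `a.b=value` style overrides applied.

    Illustrative only: values are kept as strings, with no type conversion.
    """
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, _, value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = result
        for name in parents:
            # Walk down the nesting, creating levels that do not exist yet.
            node = node.setdefault(name, {})
        node[leaf] = value
    return result

# Hypothetical default config; only `data.data_name` appears in the README.
defaults = {"data": {"data_name": "PLACEHOLDER_DEFAULT", "batch_size": 32}}
updated = apply_overrides(defaults, ["data.data_name=LJ"])
```

In this sketch the original `defaults` dict is left untouched, which makes it easy to compare the effective config against the defaults listed in ./mbrnnms/config.py.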

Advanced: Custom dataset

Copy mbrnnms.main_train and replace the DataModule with your own.

    # datamodule = LJSpeechDataModule(batch_size, ...)
    datamodule = YourSuperCoolDataModule(batch_size, ...)
    # That's all!

System Details

Model

PreNet: GRU
Upsampler: time-directional nearest interpolation
Decoder: Embedding-auto-regressive generative RNN with 10-bit μ-law encoding
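
To make the upsampler and the 10-bit μ-law step concrete, here is a minimal, dependency-free sketch. It is not taken from the repository: the function names, the `factor` argument, and the rounding convention are assumptions for illustration, and the real model operates on tensors rather than Python lists.

```python
import math

def upsample_nearest(frames, factor):
    """Repeat each frame `factor` times along the time axis (nearest interpolation)."""
    return [frame for frame in frames for _ in range(factor)]

def mulaw_encode(x, bits=10):
    """Map a sample in [-1, 1] to an integer class in [0, 2**bits - 1]."""
    mu = 2 ** bits - 1
    x = max(-1.0, min(1.0, x))  # clip to the valid input range
    # Logarithmic companding: small amplitudes get finer resolution.
    compressed = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
    # Shift [-1, 1] to [0, mu] and round to the nearest class index.
    return int(math.floor((compressed + 1) / 2 * mu + 0.5))

def mulaw_decode(index, bits=10):
    """Invert `mulaw_encode`, up to quantization error."""
    mu = 2 ** bits - 1
    compressed = 2 * index / mu - 1
    return math.copysign(((1 + mu) ** abs(compressed) - 1) / mu, compressed)

# Each conditioning frame is held for `factor` output steps (factor is hypothetical).
conditioning = upsample_nearest([[0.1, 0.2], [0.3, 0.4]], factor=3)
# A waveform sample becomes one of 2**10 = 1024 classes and back again.
restored = mulaw_decode(mulaw_encode(0.5))
```

With 10 bits, each waveform sample is one of 1024 classes. That is the vocabulary an embedding-based autoregressive decoder would look up at its input and predict at its output.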

Results

Output Sample

Demo

Performance

X [iter/sec] @ NVIDIA T4 on Google Colaboratory (AMP+, num_workers=8)

It takes about Y days for full training.
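
The throughput and the training time above are linked by simple arithmetic. The helper below is a sketch of that conversion. The function name and the numbers in the usage line are hypothetical placeholders, not measured results for this model.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def estimate_training_days(total_iters, iters_per_sec):
    """Convert a total iteration count and a throughput into wall-clock days."""
    if iters_per_sec <= 0:
        raise ValueError("iters_per_sec must be positive")
    return total_iters / iters_per_sec / SECONDS_PER_DAY

# Placeholder inputs only: 864,000 iterations at 10 iter/sec.
example_days = estimate_training_days(864_000, 10.0)
```

Plugging in the measured X [iter/sec] and the configured number of training iterations gives the Y figure.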

References

Acknowledgements

: Basic vocoder concept came from this paper.
bshall/UniversalVocoding: Model and hyperparameters are derived from this repository. All code has been rewritten.

Owner

tarepan
