EfficientTTS

Unofficial PyTorch implementation of "EfficientTTS: An Efficient and High-Quality Text-to-Speech Architecture" (arXiv).

Disclaimer: Some people have mistakenly assumed that I am one of the authors. I am not on the author list of this paper; I am just a TTS enthusiast. The paper omits some details that matter for the implementation, so some model parameters in the current version are based on my own understanding and experiments and may not match those used by the authors.

Updates

2020/12/23: Mandarin Chinese samples uploaded. The experiment settings are exactly the same as in the LJSpeech example. A complete description of the usage will be uploaded soon.

2020/12/20: Using a HifiGAN vocoder fine-tuned on Tacotron2 GTA (ground-truth-aligned) mel spectrograms improves the quality of the generated samples. Please see the newly generated samples.

Current status

Implementation of EFTS-CNN + HifiGAN

Setup with virtualenv

$ cd tools
$ make
# If you want to use distributed training, run the following
# command to install apex.
$ make apex

Note: To specify the Python, CUDA, or PyTorch version, pass them as variables to make, for example:

$ make PYTHON=3.7 CUDA_VERSION=10.1 PYTORCH_VERSION=1.6
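
After setup, it can be useful to confirm that the environment actually contains the PyTorch and CUDA versions you asked for. The following is a minimal sketch, not part of this repository: the helper name `describe_torch_env` is hypothetical, and it assumes you run it with the Python interpreter of the environment created by the setup step.

```python
import importlib.util


def describe_torch_env():
    """Return a dict describing the installed PyTorch, or noting its absence."""
    # Avoid an ImportError if PyTorch is missing from this interpreter.
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}

    import torch

    return {
        "installed": True,
        "torch_version": torch.__version__,
        # CUDA version PyTorch was built against; None for CPU-only builds.
        "cuda_build": torch.version.cuda,
        # True only if a usable GPU and driver are visible at runtime.
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in describe_torch_env().items():
        print(f"{key}: {value}")
```

If `torch_version` or `cuda_build` differs from what you passed to make, the script is probably running under a different interpreter than the one the setup step installed into.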

Training

Go to the egs/lj folder and see run.sh for an example of how to run training.

Acknowledgement

The code framework is from https://github.com/kan-bayashi/ParallelWaveGAN

Owner

Liu Songxiang
