PyTorch implementation of Tacotron speech synthesis model.

Last update: Dec 09, 2022

Overview

tacotron_pytorch

Inspired by keithito/tacotron. The speech quality is currently not as good as what keithito/tacotron can generate, but the model appears to be basically working. Generated speech examples from a model trained on the LJ Speech Dataset are available here.

If you are comfortable working with TensorFlow, I'd recommend trying https://github.com/keithito/tacotron instead. The reason for rewriting it in PyTorch is that, at least for me, it is easier to debug and extend (multi-speaker architecture, etc.).

Requirements

PyTorch
TensorFlow (needed to run the training script; this could be made optional, but for now it is required)

Installation

git clone --recursive https://github.com/r9y9/tacotron_pytorch
pip install -e . # or python setup.py develop

If you want to run the training script, then you need to install additional dependencies.

pip install -e ".[train]"

Training

The package relies on keithito/tacotron for text processing, audio preprocessing and audio reconstruction (added as a submodule). Please follow the quick start section at https://github.com/keithito/tacotron and prepare your dataset accordingly.

Once your data is prepared, and assuming it is in "~/tacotron/training" (the default), you can train your model with:

python train.py
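
As a sanity check before launching training, the data directory can be verified with a few lines of Python. This is a minimal sketch and not part of the repository: the helper name find_training_dir is hypothetical, and it only checks that the directory exists and is non-empty, not that its contents are valid.

```python
from pathlib import Path


def find_training_dir(path="~/tacotron/training"):
    """Return the expanded data directory, or raise if it is missing or empty.

    The default path is the one named in this README; the helper itself
    is a hypothetical illustration.
    """
    data_dir = Path(path).expanduser()
    if not data_dir.is_dir():
        raise FileNotFoundError(f"{data_dir} does not exist; run preprocessing first")
    # any() stops at the first entry, so this stays cheap for large datasets.
    if not any(data_dir.iterdir()):
        raise FileNotFoundError(f"{data_dir} is empty; run preprocessing first")
    return data_dir
```

Calling find_training_dir() with no arguments checks the default location; pass another path if your data lives elsewhere.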

Alignments, predicted spectrograms, target spectrograms, predicted waveforms and checkpoints (model and optimizer states) are saved every 1000 global steps in the checkpoints directory. Training progress can be monitored with:

tensorboard --logdir=log
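
The checkpoint schedule described above (one save every 1000 global steps) can be illustrated with a small sketch. The function names and the file name pattern below are assumptions made for illustration and may not match what train.py actually does; the torch.save call is shown only as a comment so that the sketch runs without PyTorch installed.

```python
from pathlib import Path

CHECKPOINT_INTERVAL = 1000  # interval stated in this README


def should_checkpoint(global_step, interval=CHECKPOINT_INTERVAL):
    """True on every positive multiple of the interval (1000, 2000, ...)."""
    return global_step > 0 and global_step % interval == 0


def checkpoint_file(checkpoint_dir, global_step):
    """Build a per-step file path; the naming pattern is hypothetical."""
    return Path(checkpoint_dir) / f"checkpoint_step{global_step}.pth"


# Inside a training loop, a save might then look like this (assumed usage):
#
#   if should_checkpoint(global_step):
#       torch.save(
#           {"state_dict": model.state_dict(),
#            "optimizer": optimizer.state_dict(),
#            "global_step": global_step},
#           checkpoint_file("checkpoints", global_step),
#       )
```

Storing the optimizer state alongside the model weights is what makes it possible to resume training from a checkpoint rather than only run inference with it.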

Testing model

Open the notebook in the notebooks directory and change checkpoint_path to point to your model.

Owner

Ryuichi Yamamoto

