PyTorch implementation of Tacotron speech synthesis model.

Last update: Dec 09, 2022

Overview

tacotron_pytorch

Inspired by keithito/tacotron. The speech quality is currently not as good as what keithito/tacotron can generate, but the model seems to be basically working. You can find some generated speech examples, trained on the LJ Speech Dataset, here.

If you are comfortable working with TensorFlow, I'd recommend trying https://github.com/keithito/tacotron instead. The reason for rewriting it in PyTorch is that it's easier to debug and extend (multi-speaker architecture, etc.), at least for me.

Requirements

PyTorch
TensorFlow (needed only to run the training script; this dependency could be made optional, but it is required for now)

Installation

git clone --recursive https://github.com/r9y9/tacotron_pytorch
cd tacotron_pytorch
pip install -e . # or python setup.py develop

If you want to run the training script, you also need to install the additional dependencies:

pip install -e ".[train]"

Training

The package relies on keithito/tacotron (added as a submodule) for text processing, audio preprocessing and audio reconstruction. Please follow the quick start section at https://github.com/keithito/tacotron and prepare your dataset accordingly.

Once your data is prepared, and assuming it is in "~/tacotron/training" (the default location), you can train the model with:

python train.py
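Before launching a long training run, it can help to confirm that the data directory looks the way the training script expects. The following is a minimal sketch, not part of this package: the helper name check_training_dir is hypothetical, and the metadata file name train.txt is an assumption based on what the keithito/tacotron preprocessing step typically writes.

```python
from pathlib import Path


def check_training_dir(data_root="~/tacotron/training", metadata_name="train.txt"):
    """Return a list of problems found; an empty list means the layout looks usable.

    Both defaults are assumptions: the directory is the default named in this
    README, and the metadata file name is assumed from keithito/tacotron.
    """
    root = Path(data_root).expanduser()
    problems = []
    if not root.is_dir():
        problems.append(f"missing directory: {root}")
        return problems
    metadata = root / metadata_name
    if not metadata.is_file():
        problems.append(f"missing metadata file: {metadata}")
    elif metadata.stat().st_size == 0:
        problems.append(f"empty metadata file: {metadata}")
    return problems


if __name__ == "__main__":
    issues = check_training_dir()
    if issues:
        for issue in issues:
            print(issue)
    else:
        print("training directory looks ready")
```

If the function reports a problem, rerun the preprocessing step from keithito/tacotron before starting train.py.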

The alignment, predicted spectrogram, target spectrogram, predicted waveform and checkpoint (model and optimizer states) are saved every 1000 global steps in the checkpoints directory. Training progress can be monitored with:

tensorboard --logdir=log
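The periodic checkpointing described above (model and optimizer states saved every 1000 global steps) follows a common PyTorch pattern. The sketch below illustrates that pattern only; the function names, the dictionary keys and the file naming scheme are assumptions for illustration and may differ from what train.py actually writes.

```python
import os

import torch
from torch import nn


def should_save(global_step, interval=1000):
    """True on every `interval`-th step; skipping step 0 is an assumption."""
    return global_step > 0 and global_step % interval == 0


def save_checkpoint(model, optimizer, global_step, checkpoint_dir="checkpoints"):
    """Save model and optimizer states together so training can be resumed."""
    os.makedirs(checkpoint_dir, exist_ok=True)
    # Hypothetical file name; the real script may use a different scheme.
    path = os.path.join(checkpoint_dir, f"checkpoint_step{global_step}.pth")
    torch.save(
        {
            "state_dict": model.state_dict(),
            "optimizer": optimizer.state_dict(),
            "global_step": global_step,
        },
        path,
    )
    return path


def load_checkpoint(path, model, optimizer=None):
    """Restore states saved by save_checkpoint and return the saved step."""
    checkpoint = torch.load(path, map_location="cpu")
    model.load_state_dict(checkpoint["state_dict"])
    if optimizer is not None:
        optimizer.load_state_dict(checkpoint["optimizer"])
    return checkpoint["global_step"]
```

Inside a training loop, this would be called as `if should_save(global_step): save_checkpoint(model, optimizer, global_step)`. Passing `optimizer=None` to load_checkpoint is enough when the checkpoint is only used for synthesis.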

Testing model

Open the notebook in the notebooks directory and set checkpoint_path to the path of your trained model checkpoint.

Owner

Ryuichi Yamamoto
