LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search

Last update: Dec 03, 2022

Overview

LightSpeech

Unofficial PyTorch implementation of LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search. This repo uses the ESPnet implementation of FastSpeech 2 as its base. It implements only the final LightSpeech model, not the Neural Architecture Search described in the paper.

However, this implementation achieves only about 3x compression (from 27 M to 7.99 M trainable parameters) rather than 15x.

Requirements :

All code is written in Python 3.6.2.

Install PyTorch

Before installing PyTorch, check your CUDA version by running the following command: nvcc --version

pip install torch torchvision
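If you want to check the CUDA release from a script rather than by eye, the following is a minimal sketch. The helper names `parse_nvcc_release` and `detect_cuda_release` are hypothetical and not part of this repo; the sketch assumes `nvcc --version` prints a line containing `release X.Y`, which is the usual format.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple


def parse_nvcc_release(text: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from `nvcc --version` output, or None if absent."""
    match = re.search(r"release\s+(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def detect_cuda_release() -> Optional[Tuple[int, int]]:
    """Run `nvcc --version` if nvcc is on PATH; return None when it is not."""
    if shutil.which("nvcc") is None:
        return None
    # stdout=PIPE / universal_newlines keep this compatible with Python 3.6.
    result = subprocess.run(
        ["nvcc", "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_nvcc_release(result.stdout)
```

Calling `detect_cuda_release()` returns a tuple such as `(10, 1)` on a machine with the CUDA toolkit installed, which you can then use to pick a matching PyTorch build.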

This repo uses PyTorch 1.6.0 because it relies on torch.bucketize, which is not available in earlier versions of PyTorch.
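To illustrate what bucketizing does, here is a small sketch that maps continuous pitch values to bin indices. It uses the standard library's `bisect_left` as a stand-in that mirrors the default (`right=False`) behaviour of `torch.bucketize`; it is not the code used in this repo, and the values of `p_min`, `p_max` and `n_bins` are made up for illustration.

```python
from bisect import bisect_left
from typing import List, Sequence


def linspace(lo: float, hi: float, n: int) -> List[float]:
    """Return n evenly spaced points from lo to hi inclusive (n >= 2)."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def bucketize(values: Sequence[float], boundaries: Sequence[float]) -> List[int]:
    """For each value, return the index of the first boundary that is >= value."""
    return [bisect_left(boundaries, v) for v in values]


# Illustrative numbers only; real values come from your dataset statistics.
p_min, p_max, n_bins = 80.0, 400.0, 6
boundaries = linspace(p_min, p_max, n_bins - 1)  # [80, 160, 240, 320, 400]
ids = bucketize([60.0, 80.0, 200.0, 400.0, 500.0], boundaries)
print(ids)
```

Each index can then be used to look up a learned embedding, which is how FastSpeech 2 style models typically feed quantized pitch and energy back into the network.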

Installing other requirements :

pip install -r requirements.txt

To use TensorBoard, install tensorboard 1.14.0 separately, along with a compatible TensorFlow version (1.14.0).

For Preprocessing :

The filelists folder contains LJSpeech dataset files already processed with MFA (Montreal Forced Aligner), so you don't need to align text with audio (to extract durations) for the LJSpeech dataset. For other datasets, follow the instructions here. For the remaining preprocessing, run the following command:

python .\nvidia_preprocessing.py -d path_of_wavs -c configs/default.yaml

To find the min and max of F0 and energy, run:

python .\compute_statistics.py

Update the following values in hparams.py with the computed min and max of F0 and energy:

p_min = Min F0/pitch
p_max = Max F0
e_min = Min energy
e_max = Max energy
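As a rough sketch of what these four numbers mean, the snippet below computes them from per-utterance F0 and energy tracks. It is not the code in compute_statistics.py; the function name `min_max_stats` is hypothetical, and it assumes that an F0 value of 0 marks an unvoiced frame and should be excluded from the pitch range.

```python
from typing import Dict, Iterable, Sequence


def min_max_stats(
    f0_tracks: Iterable[Sequence[float]],
    energy_tracks: Iterable[Sequence[float]],
) -> Dict[str, float]:
    """Compute global min/max of F0 (voiced frames only) and energy."""
    # Assumption: F0 == 0 means "unvoiced", so those frames are skipped.
    voiced = [v for track in f0_tracks for v in track if v > 0]
    energy = [v for track in energy_tracks for v in track]
    return {
        "p_min": min(voiced),
        "p_max": max(voiced),
        "e_min": min(energy),
        "e_max": max(energy),
    }


# Toy data standing in for two utterances.
stats = min_max_stats(
    f0_tracks=[[0.0, 110.5, 220.0], [95.0, 0.0, 310.25]],
    energy_tracks=[[0.02, 1.5], [0.4, 3.75]],
)
for name, value in stats.items():
    print(f"{name} = {value}")
```

The printed lines have the same `name = value` shape as the entries listed above, so they can be copied into hparams.py by hand.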

For training

python train_lightspeech.py --outdir etc -c configs/default.yaml -n "name"

For inference

WIP

python .\inference.py -c .\configs\default.yaml -p .\checkpoints\first_1\xyz.pyt --out output --text "ModuleList can be indexed like a regular Python list but modules it contains are properly registered."

For TorchScript Export

python export_torchscript.py -c configs/default.yaml -n fastspeech_scrip --outdir etc

Checkpoint and samples:

WIP


Owner

Rishikesh (ऋषिकेश)
