Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus

PyTorch implementation of (ACM MM'21) Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus.

Requirements

See requirements in requirements.txt:

Linux
Python 3.6
PyTorch 1.0+
librosa
json, tqdm, logging

Getting started

Apply recipe to your own dataset

Put your wav files in the data directory
Edit the configuration in config/config.yaml
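
Before editing the configuration, it can help to know what sample rates your wav files actually use. The following is a minimal sketch using only the Python standard library, not part of this repository: the function name `summarize_wavs` is hypothetical, and it assumes uncompressed PCM wav files (the `wave` module raises `wave.Error` on other encodings).

```python
import wave
from collections import Counter
from pathlib import Path


def summarize_wavs(wav_dir):
    """Count PCM wav files under wav_dir by (sample rate, channel count).

    Hypothetical helper for inspecting a dataset; it is not part of Multi-Singer.
    """
    summary = Counter()
    for path in sorted(Path(wav_dir).rglob("*.wav")):
        # wave.open reads only the header here, so this is cheap per file.
        with wave.open(str(path), "rb") as wav_file:
            key = (wav_file.getframerate(), wav_file.getnchannels())
            summary[key] += 1
    return summary
```

Calling `summarize_wavs("data/wavs")` returns a mapping such as `{(sample_rate, channels): file_count}`. If more than one sample rate shows up, you may want to resample the files so they agree with the settings in config/config.yaml.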

1. Pretrain

Use our checkpoint, or train the encoder yourself and set enc_model_fpath in config/config.yaml to the resulting model. Please use the same parameters as those in encoder/params_data and encoder/params_model.

2. Preprocess

Extract mel-spectrogram

python preprocess.py -i data/wavs -o data/feature -c config/config.yaml

-i your audio folder

-o output acoustic feature folder

-c config file

3. Train

Training conditioned on mel-spectrogram

python train.py -i data/feature -o checkpoints/ --config config/config.yaml

-i acoustic feature folder

-o directory to save checkpoints

--config config file

4. Inference

python inference.py -i data/feature -o outputs/ -c checkpoints/*.pkl -g config/config.yaml

-i acoustic feature folder

-o directory to save generated speech

-c checkpoint file

-g config file
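
The checkpoints/*.pkl pattern in the command above is expanded by the shell and may match several files. As a hedged sketch (not part of this repository), the helper below picks a single checkpoint and builds the same command as an argument list. The name `build_inference_command` is hypothetical, and choosing the most recently modified .pkl file is an assumption about which checkpoint you want.

```python
import glob
import os


def build_inference_command(feature_dir, output_dir, checkpoint_dir, config_path):
    """Build the inference.py argument list with one explicit checkpoint.

    Hypothetical helper: it assumes the newest .pkl file (by modification
    time) in checkpoint_dir is the checkpoint to use.
    """
    checkpoints = glob.glob(os.path.join(checkpoint_dir, "*.pkl"))
    if not checkpoints:
        raise FileNotFoundError(f"no .pkl checkpoint found in {checkpoint_dir}")
    latest = max(checkpoints, key=os.path.getmtime)
    # Flags mirror the command shown in this README.
    return [
        "python", "inference.py",
        "-i", feature_dir,
        "-o", output_dir,
        "-c", latest,
        "-g", config_path,
    ]
```

The returned list can be passed to `subprocess.run(command, check=True)` from the repository root, for example with `build_inference_command("data/feature", "outputs/", "checkpoints", "config/config.yaml")`.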

5. Singing Voice Synthesis

For Singing Voice Synthesis:

Use a modified FastSpeech 2 for mel-spectrogram synthesis.
Feed the synthesized mel-spectrogram to Multi-Singer for waveform synthesis.

Checkpoint

Trained on OpenSinger

Acknowledgements

GE2E
FastSpeech 2
Parallel WaveGAN

Citation

@inproceedings{huang2021multi,
  title={Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus},
  author={Huang, Rongjie and Chen, Feiyang and Ren, Yi and Liu, Jinglin and Cui, Chenye and Zhao, Zhou},
  booktitle={Proceedings of the 29th ACM International Conference on Multimedia},
  pages={3945--3954},
  year={2021}
}

Question

Feel free to contact me at rongjiehuang@zju.edu.cn
