Use VITS and Opencpop to develop singing voice synthesis; maybe it will become VISinger.

Overview

Init

This project is based on:

https://github.com/jaywalnut310/vits
https://github.com/SJTMusicTeam/Muskits/
https://wenet.org.cn/opencpop/ (singing voice data)

Preprocess the data with muskit to obtain the initial data

cd egs/opencpop/svs1/
./local/data.sh

VISinger_data
--lable
--midi_dump
--wav_dump

Sample rate conversion

python wave_16k.py
--wav_dump
--wav_dump_16k
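
As a rough illustration of what converting wav_dump to 16 kHz involves, the sketch below resamples a mono signal by linear interpolation. It is not the contents of wave_16k.py; the function name `resample_linear` is hypothetical, and the method is deliberately simplistic: it applies no anti-aliasing filter, so a real conversion would normally use a proper resampling library instead.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample a mono signal by linear interpolation.

    Illustration only: no low-pass filter is applied, so downsampling
    real audio this way can introduce aliasing.
    """
    if not samples:
        return []
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate            # fractional index in the source
        left = int(pos)
        right = min(left + 1, len(samples) - 1)  # clamp at the last sample
        frac = pos - left
        out.append((1.0 - frac) * samples[left] + frac * samples[right])
    return out


# 10 ms of a ramp signal at 44.1 kHz, converted to 16 kHz.
ramp_16k = resample_linear(list(range(441)), 44100, 16000)
```

The output length scales by dst_rate / src_rate, which is where the memory saving of the 16 kHz data comes from.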

Use muskit to convert the data into the VITS format

1. Split the labels
python muskit/data_label_single.py

label_dump, midi_dump, wav_dump: one annotation per file

Note: label and lable are both used here (both spellings are correct)

VISinger_data
--label_dump
--midi_dump
--wav_dump
--wav_dump_16k
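
The splitting step can be pictured with the sketch below. It assumes, purely for illustration, that the combined label file holds one utterance per line in the form "<utt_id> <annotation...>"; the actual format read by muskit/data_label_single.py may differ, and the function name `split_label_file` and the ".lab" extension are hypothetical.

```python
import os


def split_label_file(label_path, out_dir):
    """Write each utterance of a combined label file to its own file.

    Assumed input format (hypothetical): "<utt_id> <annotation...>" per line.
    Returns the list of paths that were written.
    """
    os.makedirs(out_dir, exist_ok=True)
    written = []
    with open(label_path, encoding="utf-8") as src:
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            utt_id, _, annotation = line.partition(" ")
            out_path = os.path.join(out_dir, utt_id + ".lab")
            with open(out_path, "w", encoding="utf-8") as dst:
                dst.write(annotation.strip() + "\n")
            written.append(out_path)
    return written
```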

2. Convert the labels and MIDI into frame-aligned phoneme units and notes (fundamental pitch)
python muskit/data_format_vits.py
VISinger_data
--label_vits
--label_dump
--midi_dump
--wav_dump
--wav_dump_16k
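
To make "frame-aligned" concrete, the sketch below expands timed segments into one symbol per frame. The 16 kHz sample rate matches the converted audio, but the hop length of 256, the segment tuple layout, and the function name `segments_to_frames` are assumptions for illustration and are not taken from muskit/data_format_vits.py.

```python
def segments_to_frames(segments, sample_rate=16000, hop_length=256):
    """Expand (symbol, start_sec, end_sec) segments into one symbol per frame.

    Segments are assumed to be contiguous and sorted by time; any gap
    between two segments is absorbed by the later one.
    """
    frames_per_second = sample_rate / hop_length
    frames = []
    for symbol, _start, end in segments:
        # Index of the first frame after this segment ends.
        last = round(end * frames_per_second)
        frames.extend([symbol] * max(0, last - len(frames)))
    return frames


# Two phonemes sung on a single note (MIDI note number 68).
phone_frames = segments_to_frames([("sh", 0.0, 0.08), ("ou", 0.08, 0.24)])
note_frames = segments_to_frames([(68, 0.0, 0.24)])
```

Running the same expansion on the phoneme labels and on the MIDI notes yields two sequences of equal length, so each frame has both a phoneme and a pitch.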

3. Generate the file list that VITS needs and split it into train and dev; a test set is not required (it can be designed by hand)
python muskit/data_format_vits.py

Line format in vits_file.txt: wave path|label path|pitch path

cp vits_file.txt VISinger/filelists/
cd VISinger/

python preprocess.py    (splits the file list into train and dev)
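
The file list format and the train/dev split can be sketched as follows. This is not the code of preprocess.py: the function names, the `_label.npy` and `_pitch.npy` file naming, the directory arguments, and the fixed seed are all assumptions made for the example.

```python
import random


def build_filelist(utt_ids, wav_dir, label_dir):
    """Build 'wave path|label path|pitch path' lines, one per utterance."""
    lines = []
    for utt_id in sorted(utt_ids):
        wav = f"{wav_dir}/{utt_id}.wav"
        label = f"{label_dir}/{utt_id}_label.npy"   # hypothetical naming
        pitch = f"{label_dir}/{utt_id}_pitch.npy"   # hypothetical naming
        lines.append("|".join([wav, label, pitch]))
    return lines


def split_train_dev(lines, dev_size, seed=1234):
    """Shuffle reproducibly and hold out dev_size lines for validation."""
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)
    return shuffled[dev_size:], shuffled[:dev_size]


lines = build_filelist(
    [f"utt{i:04d}" for i in range(10)],
    "../VISinger_data/wav_dump_16k",
    "../VISinger_data/label_vits",
)
train_lines, dev_lines = split_train_dev(lines, dev_size=2)
```

Seeding the shuffle keeps the dev set stable across runs, which makes validation losses comparable between experiments.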

VITS training

cd VISinger
CUDA_VISIBLE_DEVICES=0 python train.py -c configs/singing_base.json -m singing_base 2>exit_error.log;cat exit_error.log
python vsinging_infer.py

Using 16 kHz audio saves memory and makes it easier to modify the model

Edit the MIDI, then test

cd ../;python muskit/infer_midi.py;cd -;python vsinging_edit.py

Loss values and mel spectrogram

Sample audio

vits_singing_样例.wav

Comments
  • couple of questions

    Hello, how are you! Very cool stuff you have here. I can clearly see you love singing voice synthesis (SVS) from your forks and repos! I wanted to ask: is this a fully working VISinger, or is it your attempt at making VITS sing? Could it be tested on custom English data and give results the same as, or close to, the demo in the paper? Also, do you have other samples I can hear? I know you tested it on Opencpop, which has about 5.2 hours of singing data, and in the paper they trained VISinger for 600k iterations, right? How many iterations did you run on Opencpop to get the result linked below (vits_singing_样例.wav)? To be honest, I thought VITS was data hungry like Tacotron 2 or FastSpeech (i.e. it needs a lot of data to get great results), so your Opencpop result is very impressive for 5.2 hours of data. I also wonder whether you lowered the sample rate of Opencpop from 44.1 kHz to 22 kHz, as I heard that training at 44.1 kHz takes a lot longer, around 10x the time.

    Can't wait to hear from you :)

    opened by dutchsing009 5
  • Question

    python prepare/data_vits.py outputs: 1, ../VISinger_data/label_vits/XXX._label.npy|XXX_score.npy|XXX_pitch.npy|XXX_slurs.npy 2, filelists/vits_file.txt, with line format: wave path|label path|score path|pitch path|slurs path;

    How are steps 1 and 2 carried out?

    opened by baipeng0110 3
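  • Training results

    The model currently lacks a duration prediction model and a pitch (F0) prediction model. The examples below show the effect of changing the lyrics of a sentence from the training corpus.

    Original lyrics: 雨淋湿了天空灰得更讲究

    https://user-images.githubusercontent.com/16432329/164953151-4c2513cb-f336-416b-8f04-604f13e63368.MP4

    Modified lyrics: 你闹够了没有让我更难受

    https://user-images.githubusercontent.com/16432329/164953155-16c72670-cc89-40bc-99fe-42781c9dcdc0.MP4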

    help wanted 
    opened by MaxMax2016 0
  • About release models and VISinger

    Hi

    This is one of the most fantastic projects I have ever seen.

    Could you please share the released model? The inference step says "using the released model".

    Also, is there any plan to implement the VISinger model?

    Thank you!

    opened by shiyanpei0826 1
Owner
AmorTX