Voice Conversion by CycleGAN (语音克隆/语音转换):CycleGAN-VC3

Overview

CycleGAN-VC3-PyTorch

standard-readme compliant Donate

中文说明 | English


This code is a PyTorch implementation for paper: CycleGAN-VC3: Examining and Improving CycleGAN-VCs for Mel-spectrogram Conversion, a nice work on Voice-Conversion/Voice Cloning.

  • Dataset
    • VC
  • Usage
    • Training
    • Example
  • Demo
  • Reference

CycleGAN-VC3

Project Page

Non-parallel voice conversion (VC) is a technique for learning mappings between source and target speeches without using a parallel corpus. Recently, CycleGAN-VC [3] and CycleGAN-VC2 [2] have shown promising results regarding this problem and have been widely used as benchmark methods. However, owing to the ambiguity of the effectiveness of CycleGAN-VC/VC2 for mel-spectrogram conversion, they are typically used for mel-cepstrum conversion even when comparative methods employ mel-spectrogram as a conversion target. To address this, we examined the applicability of CycleGAN-VC/VC2 to mel-spectrogram conversion. Through initial experiments, we discovered that their direct applications compromised the time-frequency structure that should be preserved during conversion. To remedy this, we propose CycleGAN-VC3, an improvement of CycleGAN-VC2 that incorporates time-frequency adaptive normalization (TFAN). Using TFAN, we can adjust the scale and bias of the converted features while reflecting the time-frequency structure of the source mel-spectrogram. We evaluated CycleGAN-VC3 on inter-gender and intra-gender non-parallel VC. A subjective evaluation of naturalness and similarity showed that for every VC pair, CycleGAN-VC3 outperforms or is competitive with the two types of CycleGAN-VC2, one of which was applied to mel-cepstrum and the other to mel-spectrogram.

network comparison Figure 1. We developed time-frequency adaptive normalization (TFAN), which extends instance normalization [5] so that the affine parameters become element-dependent and are determined according to an entire input mel-spectrogram.


This repository contains:

  1. TFAN module code which implemented the TFAN module
  2. model code which implemented the model network.
  3. audio preprocessing script you can use to create cache for training data.
  4. training scripts to train the model.

Table of Contents


Requirement

pip install -r requirements.txt

Usage


Reference

  1. CycleGAN-VC3: Examining and Improving CycleGAN-VCs for Mel-spectrogram Conversion. Paper, Project
  2. CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion. Paper, Project
  3. Parallel-Data-Free Voice Conversion Using Cycle-Consistent Adversarial Networks. Paper, Project
  4. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. Paper, Project, Code
  5. Image-to-Image Translation with Conditional Adversarial Nets. Paper, Project, Code

Donation

If this project help you reduce time to develop, you can give me a cup of coffee :)

AliPay(支付宝)

ali_pay

WechatPay(微信)

wechat_pay

paypal


License

MIT © Kun

Owner
Kun Ma
artizan
Kun Ma
Python Implementation of Chess Playing AI with variable difficulty

Chess AI with variable difficulty level implemented using the MiniMax AB-Pruning Algorithm

Ali Imran 7 Feb 20, 2022
This repository contains part of the code used to make the images visible in the article "How does an AI Imagine the Universe?" published on Towards Data Science.

Generative Adversarial Network - Generating Universe This repository contains part of the code used to make the images visible in the article "How doe

Davide Coccomini 9 Dec 18, 2022
Implementations of CNNs, RNNs, GANs, etc

Tensorflow Programs and Tutorials This repository will contain Tensorflow tutorials on a lot of the most popular deep learning concepts. It'll also co

Adit Deshpande 1k Dec 30, 2022
A simple code to convert image format and channel as well as resizing and renaming multiple images.

Rename-Resize-and-convert-multiple-images A simple code to convert image format and channel as well as resizing and renaming multiple images. This cod

Happy N. Monday 3 Feb 15, 2022
Multi-Glimpse Network With Python

Multi-Glimpse Network Our code requires Python ≥ 3.8 Installation For example, venv + pip: $ python3 -m venv env $ source env/bin/activate (env) $ pyt

9 May 10, 2022
An abstraction layer for mathematical optimization solvers.

MathOptInterface Documentation Build Status Social An abstraction layer for mathematical optimization solvers. Replaces MathProgBase. Citing MathOptIn

JuMP-dev 284 Jan 04, 2023
level1-image-classification-level1-recsys-09 created by GitHub Classroom

level1-image-classification-level1-recsys-09 ❗ 주제 설명 COVID-19 Pandemic 상황 속 마스크 착용 유무 판단 시스템 구축 마스크 착용 여부, 성별, 나이 총 세가지 기준에 따라 총 18개의 class로 구분하는 모델 ?

6 Mar 17, 2022
DeepLearning Anomalies Detection with Bluetooth Sensor Data

Final Year Project. Constructing models to create offline anomalies detection using Travel Time Data collected from Bluetooth sensors along the route.

1 Jan 10, 2022
Final project for machine learning (CSC 590). Detection of hepatitis C and progression through blood samples.

Hepatitis C Blood Based Detection Final project for machine learning (CSC 590). Dataset from Kaggle. Using data from previous hepatitis C blood panels

Jennefer Maldonado 1 Dec 28, 2021
This is the repository of our article published on MDPI Entropy "Feature Selection for Recommender Systems with Quantum Computing".

Collaborative-driven Quantum Feature Selection This repository was developed by Riccardo Nembrini, PhD student at Politecnico di Milano. See the websi

Quantum Computing Lab @ Politecnico di Milano 10 Apr 21, 2022
Benchmark VAE - Library for Variational Autoencoder benchmarking

Documentation pythae This library implements some of the most common (Variational) Autoencoder models. In particular it provides the possibility to pe

1.1k Jan 02, 2023
An Open-Source Toolkit for Prompt-Learning.

An Open-Source Framework for Prompt-learning. Overview • Installation • How To Use • Docs • Paper • Citation • What's New? Nov 2021: Now we have relea

THUNLP 2.3k Jan 07, 2023
Unified API to facilitate usage of pre-trained "perceptor" models, a la CLIP

mmc installation git clone https://github.com/dmarx/Multi-Modal-Comparators cd 'Multi-Modal-Comparators' pip install poetry poetry build pip install d

David Marx 37 Nov 25, 2022
On the Adversarial Robustness of Visual Transformer

On the Adversarial Robustness of Visual Transformer Code for our paper "On the Adversarial Robustness of Visual Transformers"

Rulin Shao 35 Dec 14, 2022
UNAVOIDS: Unsupervised and Nonparametric Approach for Visualizing Outliers and Invariant Detection Scoring

UNAVOIDS: Unsupervised and Nonparametric Approach for Visualizing Outliers and Invariant Detection Scoring Code Summary aggregate.py: this script aggr

1 Dec 28, 2021
Sequential model-based optimization with a `scipy.optimize` interface

Scikit-Optimize Scikit-Optimize, or skopt, is a simple and efficient library to minimize (very) expensive and noisy black-box functions. It implements

Scikit-Optimize 2.5k Jan 04, 2023
Json2Xml tool will help you convert from json COCO format to VOC xml format in Object Detection Problem.

JSON 2 XML All codes assume running from root directory. Please update the sys path at the beginning of the codes before running. Over View Json2Xml t

Nguyễn Trường Lâu 6 Aug 22, 2022
A pytorch &keras implementation and demo of Fastformer.

Fastformer Notes from the authors Pytorch/Keras implementation of Fastformer. The keras version only includes the core fastformer attention part. The

153 Dec 28, 2022
HMLET (Hybrid-Method-of-Linear-and-non-linEar-collaborative-filTering-method)

Methods HMLET (Hybrid-Method-of-Linear-and-non-linEar-collaborative-filTering-method) Dynamically selecting the best propagation method for each node

Yong 7 Dec 18, 2022
PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)

Vision Transformer for Fast and Efficient Scene Text Recognition (ICDAR 2021) ViTSTR is a simple single-stage model that uses a pre-trained Vision Tra

Rowel Atienza 198 Dec 27, 2022