An unofficial implementation of the paper "AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss".

Last update: Jun 16, 2022

Overview

AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss

This is an unofficial implementation of AutoVC, based on the official implementation.

The repository is still under construction, so some details may be missing or incomplete.

Preprocessing

python preprocess.py <data_path> <save_path> <encoder_path> [--seg_len seg] [--n_workers workers]
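
The usage line above can be read as three required positional arguments followed by two optional flags. The following is a minimal sketch of an argparse parser with that shape. It is not taken from the repository: the default values, the integer types, and the help strings (including the guess that `encoder_path` points to a speaker encoder checkpoint) are assumptions made for illustration.

```python
import argparse


def build_preprocess_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the preprocess.py usage line (sketch only)."""
    parser = argparse.ArgumentParser(prog="preprocess.py")
    # Required positional arguments, in the order shown in the usage line.
    # The help strings are guesses at what each path means.
    parser.add_argument("data_path", help="input data location (assumed: raw audio)")
    parser.add_argument("save_path", help="output location for preprocessed data")
    parser.add_argument("encoder_path", help="assumed: speaker encoder checkpoint")
    # Optional flags. The defaults are placeholders, not the repository's values.
    parser.add_argument("--seg_len", type=int, default=128)
    parser.add_argument("--n_workers", type=int, default=4)
    return parser


if __name__ == "__main__":
    # Hypothetical paths, parsed from an explicit list instead of sys.argv.
    args = build_preprocess_parser().parse_args(
        ["data/raw", "data/processed", "encoder.pt", "--n_workers", "8"]
    )
    print(args)
```

Omitting a flag falls back to its placeholder default, which is why the bracketed options in the usage line can be left out.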

Training

python train.py <config> <data_path> <save_path> [--n_steps steps] [--save_steps save] [--log_steps log] [--batch_size batch] [--seg_len seg]
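
The flag names suggest step-based training, where `--n_steps` sets the total number of steps and `--log_steps` and `--save_steps` set how often to log and to write a checkpoint. The sketch below illustrates that reading only. The function `training_events` is hypothetical, and the interval semantics are an assumption rather than something confirmed by the repository.

```python
from typing import Iterator, Tuple


def training_events(
    n_steps: int, save_steps: int, log_steps: int
) -> Iterator[Tuple[int, bool, bool]]:
    """Yield (step, should_log, should_save) for steps 1..n_steps.

    Assumption: logging fires every `log_steps` steps and checkpointing
    every `save_steps` steps. The actual train.py may behave differently.
    """
    for step in range(1, n_steps + 1):
        yield step, step % log_steps == 0, step % save_steps == 0


if __name__ == "__main__":
    # Small illustrative values, not recommended settings.
    for step, should_log, should_save in training_events(
        n_steps=6, save_steps=3, log_steps=2
    ):
        actions = [
            name
            for name, flag in (("log", should_log), ("save", should_save))
            if flag
        ]
        print(step, actions)
```

Under this reading, choosing `--save_steps` as a multiple of `--log_steps` means every checkpoint coincides with a logged step.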

Reference

If you find this implementation useful, please cite the original paper.

@InProceedings{pmlr-v97-qian19c,
  title = {{A}uto{VC}: Zero-Shot Voice Style Transfer with Only Autoencoder Loss},
  author = {Qian, Kaizhi and Zhang, Yang and Chang, Shiyu and Yang, Xuesong and Hasegawa-Johnson, Mark},
  pages = {5210--5219},
  year = {2019},
  editor = {Kamalika Chaudhuri and Ruslan Salakhutdinov},
  volume = {97},
  series = {Proceedings of Machine Learning Research},
  address = {Long Beach, California, USA},
  month = {09--15 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v97/qian19c/qian19c.pdf},
  url = {http://proceedings.mlr.press/v97/qian19c.html}
}

Owner

Chien-yu Huang
