An unofficial implementation of the paper "AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss".

Last update: Jun 16, 2022

Related tags

Computer Vision AutoVC

Overview

AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss

This is an unofficial implementation of AutoVC based on the official one.

The repository is still under construction, so some details may be missing or incomplete.

Preprocessing

python preprocess.py <data_path> <save_path> <encoder_path> [--seg_len seg] [--n_workers workers]

Training

python train.py <config> <data_path> <save_path> [--n_steps steps] [--save_steps save] [--log_steps log] [--batch_size batch] [--seg_len seg]

Reference

Please cite the paper if you find it useful.

@InProceedings{pmlr-v97-qian19c,
  title = {{A}uto{VC}: Zero-Shot Voice Style Transfer with Only Autoencoder Loss},
  author = {Qian, Kaizhi and Zhang, Yang and Chang, Shiyu and Yang, Xuesong and Hasegawa-Johnson, Mark},
  pages = {5210--5219},
  year = {2019},
  editor = {Kamalika Chaudhuri and Ruslan Salakhutdinov},
  volume = {97},
  series = {Proceedings of Machine Learning Research},
  address = {Long Beach, California, USA},
  month = {09--15 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v97/qian19c/qian19c.pdf},
  url = {http://proceedings.mlr.press/v97/qian19c.html}
}

An unofficial implementation of the paper "AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss".

Related tags

Overview

AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss

Preprocessing

Training

Reference

Owner

Chien-yu Huang

Comparison-of-OCR (KerasOCR, PyTesseract,EasyOCR)

Fast style transfer

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

A Vietnamese personal card OCR website built with Django.

Memory tests solver with using OpenCV

a micro OCR network with 0.07mb params.

This is a repository to learn and get more computer vision skills, make robotics projects integrating the computer vision as a perception tool and create a lot of awesome advanced controllers for the robots of the future.

Unofficial implementation of "TableNet: Deep Learning model for end-to-end Table detection and Tabular data extraction from Scanned Document Images"

MONAI Label is a server-client system that facilitates interactive medical image annotation by using AI.

The virtual calculator will be above the live streaming from your camera

Usando o Amazon Textract como OCR para Extração de Dados no DynamoDB

DouZero is a reinforcement learning framework for DouDizhu - 斗地主AI

An official PyTorch implementation of the paper "Learning by Aligning: Visible-Infrared Person Re-identification using Cross-Modal Correspondences", ICCV 2021.

An organized collection of tutorials and projects created for aspriring computer vision students.

Go package for OCR (Optical Character Recognition), by using Tesseract C++ library

fishington.io bot with OpenCV and NumPy

Machine Leaning applied to denoise images to improve OCR Accuracy

Handwritten Number Recognition using CNN and Character Segmentation

PyNeuro is designed to connect NeuroSky's MindWave EEG device to Python and provide Callback functionality to provide data to your application in real time.

A community-supported supercharged version of paperless: scan, index and archive all your physical documents