A PyTorch implementation of MBNet: MOS Prediction for Synthesized Speech with Mean-Bias Network

Last update: Dec 28, 2022

Overview

Pytorch-MBNet

Training

To train a new model, run train.py. The input arguments are:

--data_path: The path of the directory containing all .wav files of VCC-2018 and the train/dev/test split files (the files in ./data).
--save_dir: The path of the directory to save the trained models. Please create the directory before training.
--total_steps: The total number of training steps.
--valid_steps: Run validation every valid_steps training updates.
--log_steps: Write the TensorBoard log every log_steps training updates.
--update_freq: The number of steps over which gradients are accumulated before each update. The default value is 1 (no accumulation).
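The effect of --update_freq can be illustrated with a small, framework-free sketch. It is not the code in train.py: the toy model, the function names, and the choice to scale each micro-batch gradient by 1 / update_freq are assumptions made for illustration. It shows that accumulating gradients over equal-sized micro-batches gives the same gradient as one larger batch.

```python
# Illustrative sketch of gradient accumulation (assumed, not taken from train.py).
# Toy model: y_hat = w * x, trained with mean squared error.

def mse_grad(w, batch):
    """Gradient of the mean squared error w.r.t. w over one batch of (x, y) pairs."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def accumulated_grad(w, micro_batches):
    """Accumulate gradients over several micro-batches before a single update.

    Each micro-batch gradient is scaled by 1 / update_freq so that the sum
    matches the gradient of one large batch (micro-batches must be equal-sized).
    """
    update_freq = len(micro_batches)
    grad = 0.0
    for batch in micro_batches:
        grad += mse_grad(w, batch) / update_freq
    return grad

data = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.5), (4.0, 8.5)]
w = 0.5

full_batch_grad = mse_grad(w, data)
accum_grad = accumulated_grad(w, [data[:2], data[2:]])  # update_freq = 2

# The two gradients agree up to floating-point error.
print(abs(full_batch_grad - accum_grad) < 1e-9)
```

In practice this lets a larger effective batch size fit in limited GPU memory, at the cost of more forward and backward passes per update.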

Testing

To test on VCC-2018, run test.py. The input arguments are:

--model_path: The path to the saved model.
--idtable_path: The path to the "judge id-number" mapping table file used during training.
--step: The time step used for the TensorBoard log. It can be the same as the number of training steps.
--split: The data split to test on (valid or test).
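The idea behind a "judge id-number" mapping table can be sketched as follows. The file format this repository actually uses is not described here, so the JSON format, the judge ID strings, and the helper name build_idtable are all hypothetical. The point is only that the same mapping built during training has to be reloaded at test time so that each judge maps to the same integer index.

```python
import json
import os
import tempfile

def build_idtable(judge_ids):
    """Assign each distinct judge ID a stable integer index (hypothetical helper)."""
    table = {}
    for judge in judge_ids:
        if judge not in table:
            table[judge] = len(table)
    return table

# Hypothetical judge IDs as they might appear in the training annotations.
train_judges = ["judge_A", "judge_B", "judge_A", "judge_C"]
idtable = build_idtable(train_judges)

# Save the table at training time (JSON is an assumed format) ...
path = os.path.join(tempfile.mkdtemp(), "idtable.json")
with open(path, "w") as f:
    json.dump(idtable, f)

# ... and reload it at test time so the indices stay consistent.
with open(path) as f:
    loaded = json.load(f)

print(loaded == idtable)
```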

Inference

After training on the VCC data, the model can be used for inference on other data. The input arguments are --data_path, --model_path, and --save_dir, which are similar to those above. Note that the bias-net is not used, because this code assumes that the ground-truth judge IDs are unavailable.
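A minimal sketch of why the bias-net drops out at inference time, assuming that a judge-specific score is the mean score plus a judge-dependent bias term, as the mean-bias naming suggests. The function predict_score and the toy stand-in networks are hypothetical and do not reflect this repository's actual classes.

```python
def predict_score(mean_net, features, bias_net=None, judge_id=None):
    """Return a predicted score for one utterance.

    Without a judge ID (the inference case), only the mean-net output is used.
    With a judge ID, the judge-specific bias is added (assumed formulation).
    """
    mean_score = mean_net(features)
    if bias_net is None or judge_id is None:
        return mean_score
    return mean_score + bias_net(features, judge_id)

# Hypothetical stand-ins for trained networks.
def toy_mean_net(features):
    return sum(features) / len(features)

def toy_bias_net(features, judge_id):
    return {0: -0.5, 1: 0.25}[judge_id]

features = [3.0, 4.0, 3.5]

# Inference on new data: no judge IDs, so only the mean-net contributes.
mean_only = predict_score(toy_mean_net, features)

# Training or testing on VCC-2018: judge IDs are known, so the bias is added.
judge_specific = predict_score(toy_mean_net, features, toy_bias_net, judge_id=1)

print(mean_only, judge_specific)
```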

The pre-trained model can be found in ./pre_trained.
