The official implementation of the Interspeech 2021 paper WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution.

Last update: Jan 03, 2023

WSRGlow

The official implementation of the Interspeech 2021 paper WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution. Audio samples can be found here.

Feel free to create issues or send an email to [email protected] if you have problems running the code.

Before running the code, install the dependencies by running pip install -r requirements.txt.

The configs for the model architecture and training scheme are saved in config.yaml. You can override individual attributes by adding the --hparams flag when running a command. The general way to run a Python script is

python $SRC$ --config $CONFIG$ --hparams $KEY1$=$VALUE1$,$KEY2$=$VALUE2$,...

See hparams.py for more details.
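As a rough illustration of the KEY=VALUE,KEY=VALUE override format above, the following sketch parses such a string and applies it to a config dictionary. It is not the code in hparams.py; the function names, the config keys (lr, batch_size, use_amp) and the type-casting rule are assumptions made for this example.

```python
def parse_overrides(spec):
    """Split 'k1=v1,k2=v2' into a dict of raw string values."""
    overrides = {}
    for item in filter(None, (s.strip() for s in spec.split(","))):
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {item!r}")
        overrides[key.strip()] = value.strip()
    return overrides


def cast_like(reference, raw):
    """Cast the raw string to the type of the existing config value."""
    if isinstance(reference, bool):  # test bool first: bool is a subclass of int
        return raw.lower() in ("1", "true", "yes")
    if isinstance(reference, (int, float, str)):
        return type(reference)(raw)
    return raw  # key not in the config (or unsupported type): keep the string


def apply_overrides(config, spec):
    """Return a copy of config with the overrides from spec applied."""
    merged = dict(config)
    for key, raw in parse_overrides(spec).items():
        merged[key] = cast_like(config.get(key), raw)
    return merged


# Stand-in for values loaded from config.yaml (hypothetical keys).
base = {"lr": 0.0002, "batch_size": 8, "use_amp": False}
updated = apply_overrides(base, "lr=0.001,batch_size=16")
```

Casting each override to the type of the value it replaces keeps numeric settings numeric even though everything arrives from the command line as a string.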

To prepare data

Before training, you need to binarize the data. Put the raw wav files in the directory given by hparams['raw_data_path']. The binarized data will be written to the directory given by hparams['binary_data_path'].

Specifically, for the VCTK corpus, the file structure should look like

.
|--data
    |--raw
        |--VCTK-Corpus
            |--wav48
                |--$WAVS
|--checkpoints
    |--wsrglow

where the model checkpoints are in checkpoints/wsrglow.
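Before binarizing, it can help to confirm that the wav files are where the layout above expects them. The sketch below is a hypothetical helper, not part of the repository; it assumes that raw_data_path points at data/raw and that the wav files sit somewhere under VCTK-Corpus/wav48.

```python
from pathlib import Path


def find_wavs(raw_data_path):
    """Return all .wav files under <raw_data_path>/VCTK-Corpus/wav48, sorted."""
    wav_root = Path(raw_data_path) / "VCTK-Corpus" / "wav48"
    if not wav_root.is_dir():
        raise FileNotFoundError(f"missing directory: {wav_root}")
    # rglob also picks up files nested in subfolders of wav48
    return sorted(wav_root.rglob("*.wav"))
```

Calling find_wavs("data/raw") from the project root would then either list the files or fail early with the path that is missing.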

The command to binarize is

python binarizer.py --config config.yaml

To modify the architecture of the model

The current WSRGlow model in model.py is designed for x4 super-resolution and takes waveform, spectrogram and phase information as input.
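To make the terms concrete, the sketch below shows what x4 super-resolution implies for sample rate and length, and how one frame of audio splits into a magnitude (spectrogram) part and a phase part. It uses a naive DFT from the standard library purely for illustration; it is not the feature extraction in model.py, and the frame size and the 12 kHz example rate are assumptions.

```python
import cmath
import math

SCALE = 4  # upsampling factor stated for the current model


def target_shape(sample_rate, num_samples, scale=SCALE):
    """Sample rate and length of the output for a given low-resolution input."""
    return sample_rate * scale, num_samples * scale


def dft_frame(frame):
    """Naive DFT of one real frame; returns (magnitudes, phases) for bins 0..N/2."""
    n = len(frame)
    mags, phases = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        mags.append(abs(coeff))           # magnitude: one spectrogram column
        phases.append(cmath.phase(coeff)) # phase of the same bin, in radians
    return mags, phases


# One 8-sample frame holding a single cosine cycle (illustrative input).
frame = [math.cos(2 * math.pi * t / 8) for t in range(8)]
mags, phases = dft_frame(frame)
out_rate, out_len = target_shape(12000, len(frame))
```

A real pipeline would use an FFT-based STFT over many overlapping frames; the naive version here only shows which quantities are meant by spectrogram and phase.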

To train

Run python train.py --config config.yaml on a GPU.

To infer

Edit infer.py to set the path of the checkpoint you want to load and the paths of the wav files you want to use as inputs, then run python infer.py --config config.yaml on a GPU.


Owner

Kexun Zhang
