Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder

Last update: Jan 01, 2023

Overview

RAVE: Realtime Audio Variational autoEncoder

Official implementation of RAVE: A variational autoencoder for fast and high-quality neural audio synthesis (article link) by Antoine Caillon and Philippe Esling.

If you use RAVE as part of a music performance or installation, be sure to cite either this repository or the article!

Installation

RAVE requires Python 3.9. Install the dependencies using

pip install -r requirements.txt

Detailed instructions to set up a training station for this project are available here.

Preprocessing

RAVE comes with two command-line utilities, resample and duration. resample pre-processes (silence removal, loudness normalization) and augments (compression) an entire directory of audio files (.mp3, .aiff, .opus, .wav, .aac). duration prints the total duration of a folder of .wav files.
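To make the role of duration concrete, here is a minimal sketch of the same kind of computation using only the Python standard library. It is not the code of the RAVE utility: the function names wav_duration_seconds and folder_duration_seconds are hypothetical, and the sketch assumes uncompressed PCM .wav files that the wave module can read.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path: Path) -> float:
    """Return the duration of a single PCM .wav file, in seconds."""
    with wave.open(str(path), "rb") as wav_file:
        # duration = number of audio frames / frames per second
        return wav_file.getnframes() / wav_file.getframerate()


def folder_duration_seconds(folder: Path) -> float:
    """Sum the durations of every .wav file found under `folder` (recursively)."""
    return sum(wav_duration_seconds(p) for p in sorted(folder.rglob("*.wav")))
```

Calling folder_duration_seconds(Path("/path/to/wav/folder")) gives a quick check of how much audio a dataset contains before training.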

Training

Both RAVE and the prior model are available in this repo. For most users we recommend using the cli_helper.py script, which generates a set of instructions for training and exporting both RAVE and the prior model on a specific dataset.

python cli_helper.py

However, if you want to customize your training further, you can use the provided train_{rave, prior}.py and export_{rave, prior}.py scripts manually.

Reconstructing audio

Once the model is trained, you can reconstruct an entire folder of .wav files using

python reconstruct.py --ckpt /path/to/checkpoint --wav-folder /path/to/wav/folder
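If several folders need to be reconstructed with the same checkpoint, the command above can be wrapped in a small script. The sketch below is only an illustration: the helper names reconstruct_command and reconstruct_folders are hypothetical, and it assumes it is run from the repository root so that reconstruct.py can be found. The only flags it uses are the two shown above.

```python
import subprocess
import sys
from typing import Iterable, List


def reconstruct_command(ckpt: str, wav_folder: str) -> List[str]:
    """Build the argument list for one reconstruct.py call."""
    return [
        sys.executable, "reconstruct.py",
        "--ckpt", ckpt,
        "--wav-folder", wav_folder,
    ]


def reconstruct_folders(ckpt: str, wav_folders: Iterable[str]) -> None:
    """Run reconstruct.py once per folder, stopping at the first failure."""
    for folder in wav_folders:
        subprocess.run(reconstruct_command(ckpt, folder), check=True)
```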

You can also export RAVE to a TorchScript file using export_rave.py and then call its encode and decode methods on tensors.
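A minimal sketch of that workflow follows. The encode and decode method names come from the text above and torch.jit.load is the standard PyTorch loader for TorchScript files. Everything else is an assumption made for illustration: the helper names, the file name rave.ts, and the input shape (batch, 1, samples) should all be checked against your own export.

```python
def load_exported_model(path: str):
    """Load a TorchScript export and switch it to inference mode."""
    import torch  # imported here so roundtrip() below stays usable without PyTorch

    model = torch.jit.load(path)
    model.eval()
    return model


def roundtrip(model, audio):
    """Encode audio to the latent space, then decode it back to audio."""
    latent = model.encode(audio)
    return model.decode(latent)


# Example usage (hypothetical file name and tensor shape):
#   import torch
#   model = load_exported_model("rave.ts")
#   audio = torch.zeros(1, 1, 65536)  # assumed layout: (batch, channels, samples)
#   with torch.no_grad():
#       reconstruction = roundtrip(model, audio)
```

Keeping roundtrip independent of the loader makes it easy to insert latent manipulation between the encode and decode calls.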

Realtime usage

UPDATE

If you want to use the realtime mode, you should update your dependencies!

pip install -r requirements.txt

RAVE and the prior model can be used in realtime on live audio streams, allowing creative interactions with both models.

nn~

RAVE is compatible with the nn~ Max/MSP and PureData external.

An audio example of the prior sampling patch is available in the docs/ folder.

RAVE vst

You can also use RAVE as a VST audio plugin using the RAVE VST!

Discussion

If you have questions, want to share your experience with RAVE, or want to share musical pieces made with the model, you can use the Discussion tab!


Owner

ACIDS
