Official implementation of A cappella: Audio-visual Singing VoiceSeparation, from BMVC21

Overview

Y-Net

Official implementation of A cappella: Audio-visual Singing VoiceSeparation, British Machine Vision Conference 2021

Project page: ipcv.github.io/Acappella/
Paper: Arxiv, BMVC (not available yet)

Running a demo / Y-Net Inference

We provide simple functions to load models with pre-trained weights. Steps:

  1. Clone the repo or download y-net>VnBSS>models (models can run as a standalone package)
  2. Load a model:
from VnBSS import y_net_gr # or from models import y_net_gr 
model = y_net_gr(n=1)

Check a demo fully working:
Open In Colab

Citation

@inproceedings{acappella,
    author    = {Juan F. Montesinos and
                 Venkatesh S. Kadandale and
                 Gloria Haro},
    title     = {A cappella: Audio-visual Singing VoiceSeparation},
    booktitle = {British Machine Vision Conference (BMVC)},
    year      = {2021},

}

Repository under construction .
.
.
.
.
.
.
.

Training / Using DEV code

###Training The most difficult part is to prepare the dataset as everything is builded upon a very specific format.
To run training:
python run.py -m model_name --workname experiment_name --arxiv_path directory_of_experiments --pretrained_from path_pret_weights
You can inspect the argparse at default.py>argparse_default.
Possible model names are: y_net_g, y_net_gr, y_net_m,y_net_r,u_net,llcp

Testing

  1. Go to manuscript_scripts and replace checkpoint paths by yours in the testing scripts.
  2. Run: bash manuscript_scripts/test_gr_r.sh
  3. Replace the paths of manuscript_scripts/auto_metrics.py by your experiment_directory path.
  4. Run: python manuscript_scripts/auto_metrics.py to visualise results.

It's a complicated framework. HELP!

The best option to run the framework is to debug! Having a runable code helps to see input shapes, dataflow and to run line by line. Download The circle of life demo with the files already processed. It will act like a dataset of 6 samples. You can download it from Google Drive 1.1 Gb.

  1. Unzip the file
  2. run python run.py -m y_net_gr (for example)

Everything has been configured to run by default this way.

The model

Each effective model is wrapped by a nn.Module which takes care of computing the STFT, the mask, returning the waveform etcetera... This wrapper can be found at VnBSS>models>y_net.py>YNet. To get rid of this you can simply inherit the class, take minimum layers and keep the core_forward method, which is the inference step without the miscelanea.

FAQs

  1. How to change the optimizer's hyperparameters?
    Go to config>optimizer.json
  2. How to change clip duration, video framerate, STFT parameters or audio samplerate?
    Go to config>__init__.py
  3. How to change the batch size or the amount of epochs?
    Go to config>hyptrs.json
  4. How to dump predictions from the training and test set
    Go to default.py. Modify DUMP_FILES (can be controlled at a subset level). force argument skips the iteration-wise conditions and dumps for every single network prediction.
  5. Is tensorboard enabled?
    Yes, you will find tensorboard records at your_experiment_directory/used_workname/tensorboard
  6. Can I resume an experiment?
    Yes, if you set exactly the same experiment folder and workname, the system will detect it and will resume from there.
  7. I'm trying to resume but found AssertionError If there is an exception before running the model
  8. How to change the amount of layers of U-Net
    U-net is build dynamically given a list of layers per block as shown in models>__init__.py from outer to inner blocks.
  9. How to modify the default network values?
    The json file config>net_cfg.json overwrites any default configuration from the model.
Owner
Juan F. Montesinos
PhD student at Pompeu Fabra university Barcelona
Juan F. Montesinos
Frescobaldi LilyPond Editor

README for Frescobaldi Homepage: http://www.frescobaldi.org/ Main author: Wilbert Berendsen Frescobaldi is a LilyPond sheet music text editor. It aims

Frescobaldi 600 Dec 29, 2022
This is an AI that runs in the terminal. It is a voice assistant that can do common activities and can also help in your coding doubts like

This is an AI that runs in the terminal. It is a voice assistant that can do common activities and can also help in your coding doubts like

OneBit 1 Nov 05, 2021
A collection of python scripts for extracting and analyzing acoustics from audio files.

pyAcoustics A collection of python scripts for extracting and analyzing acoustics from audio files. Contents 1 Common Use Cases 2 Major revisions 3 Fe

Tim 74 Dec 26, 2022
python script for getting mp3 files from yaoutube playlist

mp3-from-youtube-playlist python script for getting mp3 files from youtube playlist. Do your non-tech brown relatives ask you for downloading music fr

Shuhan Mirza 7 Oct 19, 2022
Music player and music library manager for Linux, Windows, and macOS

Ex Falso / Quod Libet - A Music Library / Editor / Player Quod Libet is a music management program. It provides several different ways to view your au

Quod Libet 1.2k Jan 07, 2023
Welcome to Nexus. Your personal virtual assistant

AI Voice Assistant Welcome to Nexus voice assistant Description Have you ever heard of voice assistants like Cortana, Siri, Google assistant, and Alex

Mustafah Zacs 1 Jan 10, 2022
This bot can stream audio or video files and urls in telegram voice chats

Voice Chat Streamer This bot can stream audio or video files and urls in telegram voice chats :) 🎯 Follow me and star this repo for more telegram bot

WiskeyWorm 4 Oct 09, 2022
A simple music player, powered by Python, utilising various libraries such as Tkinter and Pygame

A simple music player, powered by Python, utilising various libraries such as Tkinter and Pygame

PotentialCoding 2 May 12, 2022
Full LAKH MIDI dataset converted to MuseNet MIDI output format (9 instruments + drums)

LAKH MuseNet MIDI Dataset Full LAKH MIDI dataset converted to MuseNet MIDI output format (9 instruments + drums) Bonus: Choir on Channel 10 Please CC

Alex 6 Nov 20, 2022
Noinoi music is smoothly playing music on voice chat of telegram.

NOINOI MUSIC BOT ✨ Features Music & Video stream support MultiChat support Playlist & Queue support Skip, Pause, Resume, Stop feature Music & Video do

2 Feb 13, 2022
Multi-Track Music Generation with the Transfomer and the Johann Sebastian Bach Chorales dataset

MMM: Exploring Conditional Multi-Track Music Generation with the Transformer and the Johann Sebastian Bach Chorales Dataset. Implementation of the pap

102 Dec 08, 2022
C++ library for audio and music analysis, description and synthesis, including Python bindings

Essentia Essentia is an open-source C++ library for audio analysis and audio-based music information retrieval released under the Affero GPL license.

Music Technology Group - Universitat Pompeu Fabra 2.3k Jan 03, 2023
Supysonic is a Python implementation of the Subsonic server API.

Supysonic Supysonic is a Python implementation of the Subsonic server API. Current supported features are: browsing (by folders or tags) streaming of

Alban 228 Nov 19, 2022
XA Music Player - Telegram Music Bot

XA Music Player Requirements 📝 FFmpeg (Latest) NodeJS nodesource.com (NodeJS 17+) Python (3.10+) PyTgCalls (Lastest) MongoDB (3.12.1) 2nd Telegram Ac

RexAshh 3 Jun 30, 2022
Small Python application that links a Digico console and Reaper, handling automatic marker insertion and tracking.

Digico-Reaper-Link This is a small GUI based helper application designed to help with using Digico's Copy Audio function with a Reaper DAW used for re

Justin Stasiw 10 Oct 24, 2022
This is an OverPowered Vc Music Player! Will work for you and play music in Voice Chatz

VcPlayer This is an OverPowered Vc Music Player! Will work for you and play music in Voice Chatz Telegram Voice-Chat Bot [PyTGCalls] ⇝ Requirements ⇜

1 Dec 20, 2021
A Python 3 script for capturing and recording a SDR stream to a WAV file (or serving it to a HTTP audio stream).

rfsoapyfile A Python 3 script for capturing and recording a SDR stream to a WAV file (or serving it to a HTTP audio stream). The script is threaded fo

4 Dec 19, 2022
?️ Open Source Audio Matching and Mastering

Matching + Mastering = ❤️ Matchering 2.0 is a novel Containerized Web Application and Python Library for audio matching and mastering. It follows a si

Sergey Grishakov 781 Jan 05, 2023
The venturimeter works on the principle of Bernoulli's equation, i.e., the pressure decreases as the velocity increases.

The venturimeter works on the principle of Bernoulli's equation, i.e., the pressure decreases as the velocity increases. The cross-section of the throat is less than the cross-section of the inlet pi

Shankar Mahadevan L 1 Dec 03, 2021
A Python library and tools AUCTUS A6 based radios.

A Python library and tools AUCTUS A6 based radios.

Jonathan Hart 6 Nov 23, 2022