Supplementary code for the experiments described in the 2021 ISMIR submission: Leveraging Hierarchical Structures for Few Shot Musical Instrument Recognition.

Overview

Music Trees

Train-test splits and hierarchies

  • For all experiments, we used the instrument-based split in /music_trees/assets/partitions/mdb-aug.json.
  • To view our Hornbostel-Sachs class hierarchy, see /music_trees/assets/taxonomies/deeper-mdb.yaml. Note that not all of the instruments in this taxonomy are used in our experiments.
  • All random taxonomies are in /music_trees/assets/taxonomies/scrambled-*.yaml.
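
As a quick sanity check on an instrument-based split, the sketch below verifies that no class appears in more than one partition. It assumes, hypothetically, that the partition JSON maps partition names (such as "train" and "test") to lists of class names; the actual schema of mdb-aug.json may differ, so the loading step may need adapting. The names load_partitions and find_overlaps are illustrative and not part of this repo.

```python
import json
from itertools import combinations


def load_partitions(path):
    # ASSUMPTION: the file is a JSON object shaped like {"train": [...], "test": [...]}.
    with open(path) as f:
        return json.load(f)


def find_overlaps(partitions):
    """Return {(split_a, split_b): shared_classes} for every pair of splits sharing a class."""
    overlaps = {}
    for (name_a, classes_a), (name_b, classes_b) in combinations(partitions.items(), 2):
        shared = set(classes_a) & set(classes_b)
        if shared:
            overlaps[(name_a, name_b)] = sorted(shared)
    return overlaps
```

An empty result means the splits are class-disjoint, which is what an instrument-based split is expected to guarantee.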

Installation

First, clone the medleydb repo and install it in editable mode (pip install -e .):

  • medleydb from marl

Next, download the MedleyDB and MedleyDB 2.0 datasets from Zenodo.

Install some utilities for visualizing the embedding space:

git clone https://github.com/hugofloresgarcia/embviz.git
cd embviz
pip install -e .

Then, clone this repo and install it with:

pip install -e .

Usage

1. Generate data

Make sure the MEDLEYDB_PATH environment variable is set (see the medleydb repo for more instructions). Then, run the generation script:

python -m music_trees.generate \
    --dataset mdb \
    --name mdb-aug \
    --example_length 1.0 \
    --augment true \
    --hop_length 0.5 \
    --sample_rate 16000

This will generate both augmented and unaugmented data for MedleyDB. NOTE: A bug in the code silently disabled data augmentation. The bug has been left in place for the sake of reproducibility. Because no augmentation was applied at the time of the experiments, the paper does not report any.
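
To make the windowing flags concrete, here is a rough sketch of how many examples a single recording would yield. It assumes that --example_length and --hop_length are given in seconds and that only full-length windows are kept; both are assumptions about music_trees.generate rather than confirmed behavior, and count_windows is a hypothetical helper.

```python
def count_windows(num_samples, sample_rate=16000, example_length=1.0, hop_length=0.5):
    """Number of full windows obtained by sliding over a signal of num_samples samples."""
    # ASSUMPTION: lengths are in seconds, so they are converted to sample counts here.
    window = int(round(example_length * sample_rate))
    hop = int(round(hop_length * sample_rate))
    if num_samples < window:
        return 0
    return 1 + (num_samples - window) // hop
```

Under these assumptions, a 10-second recording at 16 kHz (160000 samples) gives 19 one-second examples with 50% overlap.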

2. Partition data

The partition file used for all experiments is available at /music_trees/assets/partitions/mdb-aug.json.

3. Run experiments

The search script will train all models for a particular experiment. It will grab as many GPUs as are available (use CUDA_VISIBLE_DEVICES to restrict which GPUs it can see) and train as many models as it can in parallel.

Each model will be stored under /runs/<NAME>/<VERSION>.
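
If you prefer to launch a search from Python with a restricted set of GPUs, something like the following sketch could work. The helper names env_with_gpus and run_search are hypothetical; the only things taken from this README are the script path and the --name flag. It relies on the standard CUDA behavior that CUDA_VISIBLE_DEVICES limits which devices a process can see.

```python
import os
import subprocess
import sys


def env_with_gpus(gpu_ids, base_env=None):
    """Return a copy of the environment that exposes only the given GPU indices."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


def run_search(name, gpu_ids):
    # Intended to be called from the repository root so the script path resolves.
    cmd = [sys.executable, "music_trees/search.py", "--name", name]
    return subprocess.run(cmd, env=env_with_gpus(gpu_ids), check=True)


# Example (not executed here): run_search("height-v1", gpu_ids=[0, 1])
```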

Arbitrary Hierarchies

python music_trees/search.py --name scrambled-tax

Height Search (note that height=0 and height=1 are the baseline and proposed model, respectively)

python music_trees/search.py --name height-v1

Loss Ablation

python music_trees/search.py --name loss-alpha

Train the additional BCE baseline:

python music_trees/train.py --model_name hprotonet --height 4 --d_root 128 --loss_alpha 1 --name "flat (BCE)" --dataset mdb-aug --learning_rate 0.03 --loss_weight_fn cross-entropy

4. Evaluate

To evaluate a model, pass the path of the run you wish to evaluate:

python music_trees/eval.py --exp_dir <PATH_TO_RUN>/<VERSION>

Each model will store its evaluation results under /results/<NAME>/<VERSION>.
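
When an experiment has produced many runs, it can be convenient to build the evaluation commands in bulk. The sketch below assumes every run lives two levels below a root folder, as <NAME>/<VERSION>, and that each should be passed to eval.py via --exp_dir as shown above. eval_commands is a hypothetical helper, not something this repo provides.

```python
from pathlib import Path


def eval_commands(runs_root):
    """Build one eval command per <NAME>/<VERSION> directory found under runs_root."""
    commands = []
    for version_dir in sorted(Path(runs_root).glob("*/*")):
        # Skip stray files; only directories are treated as runs.
        if version_dir.is_dir():
            commands.append(["python", "music_trees/eval.py", "--exp_dir", str(version_dir)])
    return commands
```

Each returned list can be handed to subprocess.run from the repository root.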

5. Analyze

To compare models and generate analysis figures and tables, place all of the results folders you would like to analyze under a single folder. The resulting folder should look like this:

my_experiment/trial1/version_0
my_experiment/trial2/version_0
my_experiment/trial3/version_0
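
One way to assemble this layout is to copy each results folder into a common root while keeping its last two path components (run name and version). The sketch below is only an illustration; gather_results is a hypothetical helper, and it requires Python 3.8 or later for dirs_exist_ok.

```python
import shutil
from pathlib import Path


def gather_results(result_dirs, dest_root):
    """Copy each <NAME>/<VERSION> results folder into dest_root/<NAME>/<VERSION>."""
    dest_root = Path(dest_root)
    copied = []
    for src in map(Path, result_dirs):
        # Keep the last two path components: the run name and its version.
        target = dest_root / src.parent.name / src.name
        shutil.copytree(src, target, dirs_exist_ok=True)
        copied.append(target)
    return copied
```

For example, passing the folders results/trial1/version_0 and results/trial2/version_0 with dest_root set to my_experiment would produce my_experiment/trial1/version_0 and my_experiment/trial2/version_0.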

Then, run the analysis using

python music_trees/analyze.py my_experiment <OUTPUT_NAME>

The figures will be created under /analysis/<OUTPUT_NAME>.

To generate paper-ready figures, see scripts/figures.ipynb.

Owner
Hugo Flores García
PhD @interactiveaudiolab