Text-to-Music Retrieval using Pre-defined/Data-driven Emotion Embeddings

Overview

Text2Music Emotion Embedding

Text-to-Music Retrieval using Pre-defined/Data-driven Emotion Embeddings

Reference

Emotion Embedding Spaces for Matching Music to Stories, ISMIR 2021 [paper]

-- Minz Won, Justin Salamon, Nicholas J. Bryan, Gautham J. Mysore, and Xavier Serra

@inproceedings{won2021emotion,
  title={Emotion embedding spaces for matching music to stories},
  author={Won, Minz. and Salamon, Justin. and Bryan, Nicholas J. and Mysore, Gautham J. and Serra, Xavier.},
  booktitle={ISMIR},
  year={2021}
}

Requirements

conda create -n YOUR_ENV_NAME python=3.7
conda activate YOUR_ENV_NAME
pip install -r requirements.txt

Data

  • You need to collect audio files of AudioSet mood subset (link).

  • Read the audio files and store them into .npy format.

  • Other relevant data including Alm's dataset (original link), ISEAR dataset (original link), emotion embeddings, pretrained Word2Vec, and data splits are all available here (link).

  • Unzip ttm_data.tar.gz and locate the extracted data folder under text2music-emotion-embedding/.

Training

Here is an example for training a metric learning model.

python3 src/metric_learning/main.py \
        --dataset 'isear' \
        --num_branches 3 \
        --data_path YOUR_DATA_PATH_TO_AUDIOSET

Fore more examples, check bash files under scripts folder.

Test

Here is an example for the test.

python3 src/metric_learning/main.py \
        --mode 'TEST' \
        --dataset 'alm' \
        --model_load_path 'data/pretrained/alm_cross.ckpt' \
        --data_path 'YOUR_DATA_PATH_TO_AUDIOSET'

Pretrained three-branch metric learning models (alm_cross.ckpt and isear_cross.ckpt) are included in ttm_data.tar.gz. This code is reproducible by locating the unzipped data folder under text2music-emotion-embedding/.

Visualization

Embedding distribution of each model can be projected onto 2-dimensional space. We used uniform manifold approximation and projection (UMAP) to visualize the distribution. UMAP is known to preserve more of global structure compared to t-SNE.

Demo

Please try some examples done by the three-branch metric learning model [Soundcloud].

License

Some License
Owner
Minz Won
Exploring music semantics with machines
Minz Won
Adversarial Autoencoders

Adversarial Autoencoders (with Pytorch) Dependencies argparse time torch torchvision numpy itertools matplotlib Create Datasets python create_datasets

Felipe Ducau 188 Jan 01, 2023
Useful materials and tutorials for 110-1 NTU DBME5028 (Application of Deep Learning in Medical Imaging)

Useful materials and tutorials for 110-1 NTU DBME5028 (Application of Deep Learning in Medical Imaging)

7 Jun 22, 2022
PyTorch implementation of Decoupling Value and Policy for Generalization in Reinforcement Learning

PyTorch implementation of Decoupling Value and Policy for Generalization in Reinforcement Learning

48 Dec 08, 2022
Official implementation of "Membership Inference Attacks Against Self-supervised Speech Models"

Introduction Official implementation of "Membership Inference Attacks Against Self-supervised Speech Models". In this work, we demonstrate that existi

Wei-Cheng Tseng 7 Nov 01, 2022
An educational AI robot based on NVIDIA Jetson Nano.

JetBot Looking for a quick way to get started with JetBot? Many third party kits are now available! JetBot is an open-source robot based on NVIDIA Jet

NVIDIA AI IOT 2.6k Dec 29, 2022
Open-Ended Commonsense Reasoning (NAACL 2021)

Open-Ended Commonsense Reasoning Quick links: [Paper] | [Video] | [Slides] | [Documentation] This is the repository of the paper, Differentiable Open-

(Bill) Yuchen Lin 31 Oct 19, 2022
Code and data for "Broaden the Vision: Geo-Diverse Visual Commonsense Reasoning" (EMNLP 2021).

GD-VCR Code for Broaden the Vision: Geo-Diverse Visual Commonsense Reasoning (EMNLP 2021). Research Questions and Aims: How well can a model perform o

Da Yin 24 Oct 13, 2022
Advancing mathematics by guiding human intuition with AI

Advancing mathematics by guiding human intuition with AI This repo contains two colab notebooks which accompany the paper, available online at https:/

DeepMind 315 Dec 26, 2022
10th place solution for Google Smartphone Decimeter Challenge at kaggle.

Under refactoring 10th place solution for Google Smartphone Decimeter Challenge at kaggle. Google Smartphone Decimeter Challenge Global Navigation Sat

12 Oct 25, 2022
[CVPR 2021 Oral] ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis

ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis [arxiv|pdf|v

Yinan He 78 Dec 22, 2022
Normalization Calibration (NorCal) for Long-Tailed Object Detection and Instance Segmentation

NorCal Normalization Calibration (NorCal) for Long-Tailed Object Detection and Instance Segmentation On Model Calibration for Long-Tailed Object Detec

Tai-Yu (Daniel) Pan 24 Dec 25, 2022
The 3rd place solution for competition

The 3rd place solution for competition "Lyft Motion Prediction for Autonomous Vehicles" at Kaggle Team behind this solution: Artsiom Sanakoyeu [Homepa

Artsiom 104 Nov 22, 2022
A general framework for inferring CNNs efficiently. Reduce the inference latency of MobileNet-V3 by 1.3x on an iPhone XS Max without sacrificing accuracy.

GFNet-Pytorch (NeurIPS 2020) This repo contains the official code and pre-trained models for the glance and focus network (GFNet). Glance and Focus: a

Rainforest Wang 169 Oct 28, 2022
Repository for the "Gotta Go Fast When Generating Data with Score-Based Models" paper

Gotta Go Fast When Generating Data with Score-Based Models This repo contains the official implementation for the paper Gotta Go Fast When Generating

Alexia Jolicoeur-Martineau 89 Nov 09, 2022
(CVPR 2022) Pytorch implementation of "Self-supervised transformers for unsupervised object discovery using normalized cut"

(CVPR 2022) TokenCut Pytorch implementation of Tokencut: Self-supervised Transformers for Unsupervised Object Discovery using Normalized Cut Yangtao W

YANGTAO WANG 200 Jan 02, 2023
UnpNet - Rethinking 3-D LiDAR Point Cloud Segmentation(IEEE TNNLS)

UnpNet Citation Please cite the following paper if you use this repository in your reseach. @article {PMID:34914599, Title = {Rethinking 3-D LiDAR Po

Shijie Li 4 Jul 15, 2022
Training DALL-E with volunteers from all over the Internet using hivemind and dalle-pytorch (NeurIPS 2021 demo)

Training DALL-E with volunteers from all over the Internet This repository is a part of the NeurIPS 2021 demonstration "Training Transformers Together

<a href=[email protected]"> 19 Dec 13, 2022
Learning to Reach Goals via Iterated Supervised Learning

Vanilla GCSL This repository contains a vanilla implementation of "Learning to Reach Goals via Iterated Supervised Learning" proposed by Dibya Gosh et

Christoph Heindl 4 Aug 10, 2022
Download from Onlyfans.com.

OnlySave: Onlyfans downloader Getting Started: Download the setup executable from the latest release. Install and run. Only works on Windows currently

4 May 30, 2022
Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting

Autoformer (NeurIPS 2021) Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting Time series forecasting is a c

THUML @ Tsinghua University 847 Jan 08, 2023