Style-based Neural Drum Synthesis with GAN inversion

Overview

Style-based Drum Synthesis with GAN Inversion Demo

TensorFlow implementation of a style-based version of the adversarial drum synth (ADS) from the paper Adversarial Synthesis of Drum Sounds @ The 2020 DAFx Conference.

neural drum synthesis

Code

Dependencies

Python

Code has been developed with Python 3.6.13. It should work with other versions of Python 3, but has not been tested. Moreover, we rely on several third-party libraries, listed in requirements.txt. They can be installed with

$ pip install -r requirements.txt

Checkpoints

The tensorflow checkpoints for loading pre-trained network weights can be download here. Unzip the folder and save it into this projects directory: "style-drumsynth/checkpoints".

Usage

The code is contained within the ads_demo.py script, which enables conditional synthesises of drum sounds using a pretrained generator.

The following control parameters are available:

  • Condition: which type of drum to generation (kick, snare or hat)
  • Direction: "features", which principal direction to move in
  • Direction slider: How far to move in a particular direction
  • Number of generations: How many drums to generate
  • Stocastic Variation: Amount of inconsequential noise to inject into the generator
  • Randomize: Generate by randomly sampling the latent space, or generate from a fixed, pre-computed latent vectors for a kick, snare and hat
  • Encode: regenerate drum sounds stored in the ads_demo/input_audio

Generations are saved in the ads_demo/generations folder. Pretrained model weights are saved in the ads_demo/checkpoints folder.

train.py arguments

  -c CONDITION,           --condition CONDITION
                            0: kick, 1: snare, 2:hat
  -d DIRECTION,           --direction DIRECTION
                            synthesis controls [0:4]
  -ds DIRECTION_SLIDER,   --direction_slider DIRECTION_SLIDER
                            how much to move in a particular direction
  -n NUM_GENERATIONS,     --num_generations NUM_GENERATIONS
                            number of examples to generate
  -v STOCASTIC_VARIATION, --stocastic_variation STOCASTIC_VARIATION
                            amount of inconsequential noise injected
  -r RANDOMIZE,           --randomize RANDOMIZE
                            if set to False, a fixed latent vector is used to generate a drum sound from each condition
  -e ENCODE,              --encode ENCODE
                            regenerates drum sounds from encoder folder

Supporting webpage

For more information, please visit the corresponding supporting website.

It contains the following:

  • Audio examples
  • Training data
  • Generations
  • Example usage within loop-based electronic music compositions
  • Generating Drum Loops
  • Interpolation demonstration
  • Supplementary figures
  • A link to the DAFx 2020 paper and presentation

References

[1] Drysdale, J., M. Tomczak, J. Hockman, Adversarial Synthesis of Drum Sounds. Proceedings of the 23rd International Conference on Digital Audio Effects (DAFX), 2020.
@inproceedings{drysdale2020ads,
  title={Adversarial synthesis of drum sounds},
  author={Drysdale, Jake and Tomczak, Maciek and Hockman, Jason},
  booktitle = {Proceedings of the International Conference on Digital Audio Effects (DAFx)},
  year={2020}
}

Help

Any questions please feel free to contact me on [email protected]

Owner
Sound and Music Analysis (SoMA) Group
The Sound and Music Analysis (SoMA) Group in the Digital Media Technology Laboratory at Birmingham City University.
Sound and Music Analysis (SoMA) Group
Code accompanying the paper "Knowledge Base Completion Meets Transfer Learning"

Knowledge Base Completion Meets Transfer Learning This code accompanies the paper Knowledge Base Completion Meets Transfer Learning published at EMNLP

14 Nov 27, 2022
Landmarks Recogntion Web application using Streamlit.

Landmark Recognition Web-App using Streamlit Watch Tutorial for this project Source Trained model landmarks_classifier_asia_V1/1 is taken from the Ten

Kushal Bhavsar 5 Dec 12, 2022
Space-event-trace - Tracing service for spaceteam events

space-event-trace Tracing service for TU Wien Spaceteam events. This service is

TU Wien Space Team 2 Jan 04, 2022
MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tricks

MEAL-V2 This is the official pytorch implementation of our paper: "MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tric

Zhiqiang Shen 653 Dec 19, 2022
tsai is an open-source deep learning package built on top of Pytorch & fastai focused on state-of-the-art techniques for time series classification, regression and forecasting.

Time series Timeseries Deep Learning Pytorch fastai - State-of-the-art Deep Learning with Time Series and Sequences in Pytorch / fastai

timeseriesAI 2.8k Jan 08, 2023
A series of Jupyter notebooks with Chinese comment that walk you through the fundamentals of Machine Learning and Deep Learning in python using Scikit-Learn and TensorFlow.

Hands-on-Machine-Learning 目的 这份笔记旨在帮助中文学习者以一种较快较系统的方式入门机器学习, 是在学习Hands-on Machine Learning with Scikit-Learn and TensorFlow这本书的 时候做的个人笔记: 此项目的可取之处 原书的

Baymax 1.5k Dec 21, 2022
[SIGIR22] Official PyTorch implementation for "CORE: Simple and Effective Session-based Recommendation within Consistent Representation Space".

CORE This is the official PyTorch implementation for the paper: Yupeng Hou, Binbin Hu, Zhiqiang Zhang, Wayne Xin Zhao. CORE: Simple and Effective Sess

RUCAIBox 26 Dec 19, 2022
Hierarchical Cross-modal Talking Face Generation with Dynamic Pixel-wise Loss (ATVGnet)

Hierarchical Cross-modal Talking Face Generation with Dynamic Pixel-wise Loss (ATVGnet) By Lele Chen , Ross K Maddox, Zhiyao Duan, Chenliang Xu. Unive

Lele Chen 218 Dec 27, 2022
Code and data of the EMNLP 2021 paper "Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer"

StyleAttack Code and data of the EMNLP 2021 paper "Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer" Prepare Pois

THUNLP 19 Nov 20, 2022
PyTorch implementation code for the paper MixCo: Mix-up Contrastive Learning for Visual Representation

How to Reproduce our Results This repository contains PyTorch implementation code for the paper MixCo: Mix-up Contrastive Learning for Visual Represen

opcrisis 46 Dec 15, 2022
A Free and Open Source Python Library for Multiobjective Optimization

Platypus What is Platypus? Platypus is a framework for evolutionary computing in Python with a focus on multiobjective evolutionary algorithms (MOEAs)

Project Platypus 424 Dec 18, 2022
🤗 Paper Style Guide

🤗 Paper Style Guide (Work in progress, send a PR!) Libraries to Know booktabs natbib cleveref Either seaborn, plotly or altair for graphs algorithmic

Hugging Face 66 Dec 12, 2022
3D Avatar Lip Syncronization from speech (JALI based face-rigging)

visemenet-inference Inference Demo of "VisemeNet-tensorflow" VisemeNet is an audio-driven animator centric speech animation driving a JALI or standard

Junhwan Jang 17 Dec 20, 2022
DSTC10 Track 2 - Knowledge-grounded Task-oriented Dialogue Modeling on Spoken Conversations

DSTC10 Track 2 - Knowledge-grounded Task-oriented Dialogue Modeling on Spoken Conversations This repository contains the data, scripts and baseline co

Alexa 51 Dec 17, 2022
XtremeDistil framework for distilling/compressing massive multilingual neural network models to tiny and efficient models for AI at scale

XtremeDistilTransformers for Distilling Massive Multilingual Neural Networks ACL 2020 Microsoft Research [Paper] [Video] Releasing [XtremeDistilTransf

Microsoft 125 Jan 04, 2023
Some bravo or inspiring research works on the topic of curriculum learning.

Towards Scalable Unpaired Virtual Try-On via Patch-Routed Spatially-Adaptive GAN Official code for NeurIPS 2021 paper "Towards Scalable Unpaired Virtu

131 Jan 07, 2023
Another pytorch implementation of FCN (Fully Convolutional Networks)

FCN-pytorch-easiest Trying to be the easiest FCN pytorch implementation and just in a get and use fashion Here I use a handbag semantic segmentation f

Y. Dong 158 Dec 21, 2022
CaFM-pytorch ICCV ACCEPT Introduction of dataset VSD4K

CaFM-pytorch ICCV ACCEPT Introduction of dataset VSD4K Our dataset VSD4K includes 6 popular categories: game, sport, dance, vlog, interview and city.

96 Jul 05, 2022
Visualizing lattice vibration information from phonon dispersion to atoms (For GPUMD)

Phonon-Vibration-Viewer (For GPUMD) Visualizing lattice vibration information from phonon dispersion for primitive atoms. In this tutorial, we will in

Liangting 6 Dec 10, 2022
Datasets, Transforms and Models specific to Computer Vision

torchvision The torchvision package consists of popular datasets, model architectures, and common image transformations for computer vision. Installat

13.1k Jan 02, 2023