PyTorch implementation of the ECCV 2020 paper "Foley Music: Learning to Generate Music from Videos"

Last update: Nov 03, 2022


Overview

Foley Music: Learning to Generate Music from Videos

This repo holds the code for the framework presented at ECCV 2020.

Foley Music: Learning to Generate Music from Videos
Chuang Gan, Deng Huang, Peihao Chen, Joshua B. Tenenbaum, and Antonio Torralba

paper

Usage Guide

Prerequisites

The training and testing code is implemented in PyTorch for ease of use.

PyTorch 1.4

Other minor Python modules can be installed by running

pip install -r requirements.txt
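To confirm that the installed PyTorch matches the version listed above, a small check along the following lines may help. This is a minimal sketch: the helper names parse_major_minor and matches_readme_version are hypothetical and not part of this repo, and the check only compares the major and minor version numbers.

```python
import re


def parse_major_minor(version):
    """Extract (major, minor) from a version string such as '1.4.0+cu100'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def matches_readme_version(version, expected=(1, 4)):
    """Return True if the version has the major.minor pair named in the README."""
    return parse_major_minor(version) == expected
```

With PyTorch installed, the check can be run as matches_readme_version(torch.__version__) after import torch. Whether newer PyTorch releases also work is not stated in this README.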

Data Preparation

Download Datasets

The extracted pose and MIDI data for training and audio generation can be downloaded here. Unzip the archive into the ./data folder.

The original datasets (including videos) can be found at:

URMP: can be downloaded here
MUSIC: can be downloaded here
AtinPiano: proposed in "At Your Fingertips: Automatic Piano Fingering Detection". The dataset can be downloaded here
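The unzip step for the extracted pose and MIDI archive can also be scripted. The sketch below uses only the Python standard library; the function name extract_archive is hypothetical, and the archive file name is whatever the download produces (it is not specified in this README).

```python
import zipfile
from pathlib import Path


def extract_archive(archive_path, target_dir="data"):
    """Unzip archive_path into target_dir and return the sorted member names."""
    target = Path(target_dir)
    # Create ./data (or the chosen folder) if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
        return sorted(archive.namelist())
```

Calling extract_archive with the path of the downloaded file places its contents under ./data, relative to the current working directory.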

Training

For URMP

CUDA_VISIBLE_DEVICES=6 python train.py -c config/URMP/violin.conf -e exps/urmp-vn

For AtinPiano

CUDA_VISIBLE_DEVICES=6 python train.py -c config/AtinPiano.conf -e exps/atinpiano

For MUSIC

CUDA_VISIBLE_DEVICES=6 python train.py -c config/MUSIC/accordion.conf -e exps/music-accordion
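When several instruments are trained in sequence, the commands above can be assembled from Python. This is a sketch under stated assumptions: it uses only the -c and -e options shown above, the helper names are hypothetical, and the GPU index is passed through the CUDA_VISIBLE_DEVICES environment variable exactly as in the shell commands.

```python
import os
import subprocess


def build_train_command(config_path, experiment_dir):
    """Return the argv list for one training run, mirroring the commands above."""
    return ["python", "train.py", "-c", config_path, "-e", experiment_dir]


def gpu_environment(gpu_id):
    """Copy the current environment and restrict CUDA to a single device."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return env


def run_training(config_path, experiment_dir, gpu_id=0):
    """Launch train.py and raise CalledProcessError if it exits with an error."""
    subprocess.run(
        build_train_command(config_path, experiment_dir),
        env=gpu_environment(gpu_id),
        check=True,
    )
```

For example, run_training("config/URMP/violin.conf", "exps/urmp-vn", gpu_id=6) corresponds to the URMP command above. It must be run from the repository root so that train.py is found.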

Generating MIDI, sounds and videos

For URMP

VIDEO_PATH=/path/to/video
INSTRUMENT_NAME='Violin'
python test_URMP.py exps/urmp-vn/checkpoint.pth.tar -o exps/urmp-vn/generate -v "$VIDEO_PATH" -i "$INSTRUMENT_NAME"

For AtinPiano

VIDEO_PATH=/path/to/video
INSTRUMENT_NAME='Acoustic Grand Piano'
python test_AtinPiano_MUSIC.py exps/atinpiano/checkpoint.pth.tar -o exps/atinpiano/generation -v "$VIDEO_PATH" -i "$INSTRUMENT_NAME"

For MUSIC

VIDEO_PATH=/path/to/video
INSTRUMENT_NAME='Accordion'
python test_AtinPiano_MUSIC.py exps/music-accordion/checkpoint.pth.tar -o exps/music-accordion/generation -v "$VIDEO_PATH" -i "$INSTRUMENT_NAME"

Notes:

Instrument names ($INSTRUMENT_NAME) can be found here
If you do not have the video file, or you want to generate MIDI and audio only, add the -oa flag to skip video generation.
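Instrument names such as 'Acoustic Grand Piano' contain spaces, so they must reach the script as a single argument. One way to make that explicit is to build the command as a list in Python. The sketch below is an illustration only: build_generate_command is a hypothetical helper, it uses just the options shown above (-o, -v, -i, -oa), and it assumes that -v can simply be left out when no video path is given, which this README does not confirm.

```python
import shlex


def build_generate_command(script, checkpoint, output_dir, instrument,
                           video_path=None, audio_only=False):
    """Return the argv list for a generation run, mirroring the commands above."""
    command = ["python", script, checkpoint, "-o", output_dir]
    if video_path is not None:
        command += ["-v", video_path]
    # The instrument name stays one list element, so spaces need no escaping.
    command += ["-i", instrument]
    if audio_only:
        # -oa skips video generation, as described in the notes above.
        command.append("-oa")
    return command


def as_shell_line(command):
    """Render the argv list as a shell-safe string (requires Python 3.8+)."""
    return shlex.join(command)
```

The list returned by build_generate_command can be passed directly to subprocess.run, and as_shell_line shows the equivalent quoted command for pasting into a terminal.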

Other Info

Citation

Please cite the following paper if you find our work useful for your research.

@inproceedings{FoleyMusic2020,
  author    = {Chuang Gan and
               Deng Huang and
               Peihao Chen and
               Joshua B. Tenenbaum and
               Antonio Torralba},
  title     = {Foley Music: Learning to Generate Music from Videos},
  booktitle = {ECCV},
  year      = {2020},
}


Owner

Chuang Gan
