Codebase for ECCV18 "The Sound of Pixels"

Last update: Dec 20, 2022

Overview

Sound-of-Pixels


Note: this repository is under construction, but the core parts are already in place.

Environment

The code was developed under the following configuration:

Hardware: 1-4 GPUs (change [--num_gpus NUM_GPUS] accordingly)
Software: Ubuntu 16.04.3 LTS, CUDA>=8.0, Python>=3.5, PyTorch>=0.4.0
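As a quick sanity check before training, the requirements above can be verified from Python. This is a minimal sketch and not part of the repository: the helper names `parse_version` and `check_environment` are hypothetical, and the check only covers the Python version, the PyTorch version, and whether PyTorch can see a CUDA device.

```python
import importlib.util
import re
import sys


def parse_version(text):
    """Turn a version string such as '1.13.1+cu117' into (1, 13)."""
    match = re.match(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("unrecognized version string: %r" % text)
    return int(match.group(1)), int(match.group(2))


def check_environment(min_python=(3, 5), min_torch=(0, 4)):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append("Python >= %d.%d is required" % min_python)
    # Only import torch if it is actually installed.
    if importlib.util.find_spec("torch") is None:
        problems.append("PyTorch is not installed")
        return problems
    import torch
    if parse_version(torch.__version__) < min_torch:
        problems.append("PyTorch >= %d.%d is required" % min_torch)
    if not torch.cuda.is_available():
        problems.append("PyTorch cannot see a CUDA device")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

If a CUDA device is visible, `torch.cuda.device_count()` reports how many GPUs are available, which is the number to pass as NUM_GPUS.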

Training

Prepare the video dataset.

a. Download the MUSIC dataset from: https://github.com/roudimit/MUSIC_dataset

b. Download the videos.

Preprocess the videos. You can do this in your own way, as long as the resulting index files follow the same format.

a. Extract frames at 8fps and waveforms at 11025Hz from the videos. We use the following directory structure:

data
├── audio
│   ├── acoustic_guitar
│   │   ├── M3dekVSwNjY.mp3
│   │   ├── ...
│   ├── trumpet
│   │   ├── STKXyBGSGyE.mp3
│   │   ├── ...
│   ├── ...
│
└── frames
    ├── acoustic_guitar
    │   ├── M3dekVSwNjY.mp4
    │   │   ├── 000001.jpg
    │   │   ├── ...
    │   ├── ...
    ├── trumpet
    │   ├── STKXyBGSGyE.mp4
    │   │   ├── 000001.jpg
    │   │   ├── ...
    │   ├── ...
    ├── ...
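One possible way to produce this layout is to call ffmpeg once for the frames and once for the audio of each video. The sketch below is an illustration, not the authors' preprocessing script: the function names and the input layout `videos/<category>/<video_id>.mp4` are assumptions, and ffmpeg must be installed separately. It only uses the standard ffmpeg options `-vf fps=...`, `-vn`, and `-ar`.

```python
import subprocess
from pathlib import Path

FRAME_RATE = 8        # frames per second, as stated above
SAMPLE_RATE = 11025   # audio sample rate in Hz, as stated above


def build_ffmpeg_commands(video_path, data_root="data"):
    """Return (frame_cmd, audio_cmd, frame_dir, audio_file) for one video.

    Assumes (hypothetically) that the video is stored as
    <category>/<video_id>.mp4, so the parent folder names the instrument.
    """
    video = Path(video_path)
    category = video.parent.name
    # Frames go into a directory named after the video file, e.g. "xyz.mp4/".
    frame_dir = Path(data_root) / "frames" / category / video.name
    audio_file = Path(data_root) / "audio" / category / (video.stem + ".mp3")
    frame_cmd = ["ffmpeg", "-i", str(video),
                 "-vf", "fps={}".format(FRAME_RATE),
                 str(frame_dir / "%06d.jpg")]   # 000001.jpg, 000002.jpg, ...
    audio_cmd = ["ffmpeg", "-i", str(video),
                 "-vn",                          # drop the video stream
                 "-ar", str(SAMPLE_RATE),
                 str(audio_file)]
    return frame_cmd, audio_cmd, frame_dir, audio_file


def extract(video_path, data_root="data"):
    """Create the output folders and run both ffmpeg commands."""
    frame_cmd, audio_cmd, frame_dir, audio_file = build_ffmpeg_commands(
        video_path, data_root)
    frame_dir.mkdir(parents=True, exist_ok=True)
    audio_file.parent.mkdir(parents=True, exist_ok=True)
    subprocess.check_call(frame_cmd)
    subprocess.check_call(audio_cmd)
```

For example, `extract("videos/trumpet/STKXyBGSGyE.mp4")` would write frames under data/frames/trumpet/STKXyBGSGyE.mp4/ and audio to data/audio/trumpet/STKXyBGSGyE.mp3.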

b. Make training/validation index files by running:

python scripts/create_index_files.py

This creates the index files train.csv and val.csv in the following format:

./data/audio/acoustic_guitar/M3dekVSwNjY.mp3,./data/frames/acoustic_guitar/M3dekVSwNjY.mp4,1580
./data/audio/trumpet/STKXyBGSGyE.mp3,./data/frames/trumpet/STKXyBGSGyE.mp4,493

Each row stores: AUDIO_PATH,FRAMES_PATH,NUMBER_FRAMES
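If you preprocess the data in your own way, index rows of this shape can be generated and read back with a few lines of Python. The following is a minimal sketch under the directory layout shown earlier; it is not the code of scripts/create_index_files.py, the function names are hypothetical, and it does not perform the train/validation split.

```python
import csv
from pathlib import Path


def build_index_rows(data_root="./data"):
    """Collect (AUDIO_PATH, FRAMES_PATH, NUMBER_FRAMES) for every audio file
    that has a matching frame directory."""
    audio_root = Path(data_root) / "audio"
    frames_root = Path(data_root) / "frames"
    rows = []
    for audio_path in sorted(audio_root.glob("*/*.mp3")):
        category = audio_path.parent.name
        # Frame directories are named after the video file, e.g. "xyz.mp4".
        frame_dir = frames_root / category / (audio_path.stem + ".mp4")
        if not frame_dir.is_dir():
            continue  # skip audio without extracted frames
        num_frames = len(list(frame_dir.glob("*.jpg")))
        rows.append((audio_path.as_posix(), frame_dir.as_posix(), num_frames))
    return rows


def write_index(rows, csv_path):
    """Write rows as comma-separated lines without a header."""
    with open(str(csv_path), "w", newline="") as f:
        csv.writer(f).writerows(rows)


def read_index(csv_path):
    """Read an index file back into (audio_path, frames_path, num_frames)."""
    with open(str(csv_path), newline="") as f:
        return [(audio, frames, int(count)) for audio, frames, count in csv.reader(f)]
```

Note that `pathlib` normalizes paths, so the rows written by this sketch do not carry the leading "./" seen in the example rows above.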

Train the default model.

./scripts/train_MUSIC.sh

During training, visualizations are saved in HTML format under ckpt/MODEL_ID/visualization/.

Evaluation

(Optional) Download our trained model weights for evaluation.

./scripts/download_trained_model.sh

Evaluate the performance of the trained model.

./scripts/eval_MUSIC.sh

Reference

If you use the code or dataset from this project, please cite:

    @InProceedings{Zhao_2018_ECCV,
        author = {Zhao, Hang and Gan, Chuang and Rouditchenko, Andrew and Vondrick, Carl and McDermott, Josh and Torralba, Antonio},
        title = {The Sound of Pixels},
        booktitle = {The European Conference on Computer Vision (ECCV)},
        month = {September},
        year = {2018}
    }


Owner

Hang Zhao
