《Image2Reverb: Cross-Modal Reverb Impulse Response Synthesis》(2021)

Overview

Image2Reverb

Image2Reverb is an end-to-end neural network that generates plausible audio impulse responses from single images of acoustic environments. Code for the paper Image2Reverb: Cross-Modal Reverb Impulse Response Synthesis. The architecture is a conditional GAN with a ResNet50 (pre-trained on Places365 and fine-tuned) image encoder. It generates monoaural audio impulse responses (directly applicable to convolution applications) as magnitude spectrograms.

Dependencies

Model/Data:

  • PyTorch>=1.7.0
  • PyTorch Lightning
  • torchvision
  • torchaudio
  • librosa
  • PyRoomAcoustics
  • PIL

Eval/Preprocessing:

  • PySoundfile
  • SciPy
  • Scikit-Learn
  • python-acoustics
  • google-images-download
  • matplotlib

Usage

We will make a pre-trained model available soon!

Acknowledgments

We borrow and adapt code snippets from GANSynth (and this PyTorch re-implementation), additional snippets from this PGGAN implementation, and more.

Owner
Nikhil Singh
Nikhil Singh
Proposal, Tracking and Segmentation (PTS): A Cascaded Network for Video Object Segmentation

Proposal, Tracking and Segmentation (PTS): A Cascaded Network for Video Object Segmentation By Qiang Zhou*, Zilong Huang*, Lichao Huang, Han Shen, Yon

Forest 117 Apr 01, 2022
Answering Open-Domain Questions of Varying Reasoning Steps from Text

This repository contains the authors' implementation of the Iterative Retriever, Reader, and Reranker (IRRR) model in the EMNLP 2021 paper "Answering Open-Domain Questions of Varying Reasoning Steps

26 Dec 22, 2022
Code for layerwise detection of linguistic anomaly paper (ACL 2021)

Layerwise Anomaly This repository contains the source code and data for our ACL 2021 paper: "How is BERT surprised? Layerwise detection of linguistic

6 Dec 07, 2022
PyKale is a PyTorch library for multimodal learning and transfer learning as well as deep learning and dimensionality reduction on graphs, images, texts, and videos

PyKale is a PyTorch library for multimodal learning and transfer learning as well as deep learning and dimensionality reduction on graphs, images, texts, and videos. By adopting a unified pipeline-ba

PyKale 370 Dec 27, 2022
TensorFlow (v2.7.0) benchmark results on an M1 Macbook Air 2020 laptop (macOS Monterey v12.1).

M1-tensorflow-benchmark TensorFlow (v2.7.0) benchmark results on an M1 Macbook Air 2020 laptop (macOS Monterey v12.1). I was initially testing if Tens

particle 2 Jan 05, 2022
moving object detection for satellite videos.

DSFNet: Dynamic and Static Fusion Network for Moving Object Detection in Satellite Videos Algorithm Introduction DSFNet: Dynamic and Static Fusion Net

xiaochao 39 Dec 16, 2022
ICON: Implicit Clothed humans Obtained from Normals

ICON: Implicit Clothed humans Obtained from Normals arXiv, December 2021. Yuliang Xiu · Jinlong Yang · Dimitrios Tzionas · Michael J. Black Table of C

Yuliang Xiu 1.1k Dec 30, 2022
EMNLP 2021 - Frustratingly Simple Pretraining Alternatives to Masked Language Modeling

Frustratingly Simple Pretraining Alternatives to Masked Language Modeling This is the official implementation for "Frustratingly Simple Pretraining Al

Atsuki Yamaguchi 31 Nov 18, 2022
Code for the paper BERT might be Overkill: A Tiny but Effective Biomedical Entity Linker based on Residual Convolutional Neural Networks

Biomedical Entity Linking This repo provides the code for the paper BERT might be Overkill: A Tiny but Effective Biomedical Entity Linker based on Res

Tuan Manh Lai 24 Oct 24, 2022
TEDSummary is a speech summary corpus. It includes TED talks subtitle (Document), Title-Detail (Summary), speaker name (Meta info), MP4 URL, and utterance id

TEDSummary is a speech summary corpus. It includes TED talks subtitle (Document), Title-Detail (Summary), speaker name (Meta info), MP4 URL

3 Dec 26, 2022
chen2020iros: Learning an Overlap-based Observation Model for 3D LiDAR Localization.

Overlap-based 3D LiDAR Monte Carlo Localization This repo contains the code for our IROS2020 paper: Learning an Overlap-based Observation Model for 3D

Photogrammetry & Robotics Bonn 219 Dec 15, 2022
Minimal implementation of Denoised Smoothing: A Provable Defense for Pretrained Classifiers in TensorFlow.

Denoised-Smoothing-TF Minimal implementation of Denoised Smoothing: A Provable Defense for Pretrained Classifiers in TensorFlow. Denoised Smoothing is

Sayak Paul 19 Dec 11, 2022
DPC: Unsupervised Deep Point Correspondence via Cross and Self Construction (3DV 2021)

DPC: Unsupervised Deep Point Correspondence via Cross and Self Construction (3DV 2021) This repo is the implementation of DPC. Tested environment Pyth

Dvir Ginzburg 30 Nov 30, 2022
Human Activity Recognition example using TensorFlow on smartphone sensors dataset and an LSTM RNN. Classifying the type of movement amongst six activity categories - Guillaume Chevalier

LSTMs for Human Activity Recognition Human Activity Recognition (HAR) using smartphones dataset and an LSTM RNN. Classifying the type of movement amon

Guillaume Chevalier 3.1k Dec 30, 2022
Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.

Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.

Nerdy Rodent 2.3k Jan 04, 2023
Code for the Paper: Alexandra Lindt and Emiel Hoogeboom.

Discrete Denoising Flows This repository contains the code for the experiments presented in the paper Discrete Denoising Flows [1]. To give a short ov

Alexandra Lindt 3 Oct 09, 2022
A python library for time-series smoothing and outlier detection in a vectorized way.

tsmoothie A python library for time-series smoothing and outlier detection in a vectorized way. Overview tsmoothie computes, in a fast and efficient w

Marco Cerliani 517 Dec 28, 2022
Hierarchical probabilistic 3D U-Net, with attention mechanisms (—𝘈𝘵𝘵𝘦𝘯𝘵𝘪𝘰𝘯 𝘜-𝘕𝘦𝘵, 𝘚𝘌𝘙𝘦𝘴𝘕𝘦𝘵) and a nested decoder structure with deep supervision (—𝘜𝘕𝘦𝘵++).

Hierarchical probabilistic 3D U-Net, with attention mechanisms (—𝘈𝘵𝘵𝘦𝘯𝘵𝘪𝘰𝘯 𝘜-𝘕𝘦𝘵, 𝘚𝘌𝘙𝘦𝘴𝘕𝘦𝘵) and a nested decoder structure with deep supervision (—𝘜𝘕𝘦𝘵++). Built in TensorFlow 2.5. Configured for vox

Diagnostic Image Analysis Group 32 Dec 08, 2022