Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning

Last update: Jul 06, 2022

Overview

Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning

Reference

 Abeßer, J. & Müller, M. Towards Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning, submitted to: ICASSP 2022

Related Work

we use pre-computed features & model architecture used in 3 previous papers
- these are all unsupervised domain adaptation methods

    Mezza, A. I., Habets, E. A. P., Müller, M., & Sarti, A. (2021).
    #Unsupervised domain adaptation for acoustic scene classification
    using band-wise statistics matching. Proceedings of the European
    Signal Processing Conference (EUSIPCO), 11–15.
    https://doi.org/10.23919/Eusipco47968.2020.9287533"

    Drossos, K., Magron, P., & Virtanen, T. (2019). Unsupervised Adversarial Domain Adaptation based
    on the Wasserstein Distance for Acoustic Scene Classification. Proceedings of the IEEE Workshop
    on Applications of Signal Processing to Audio and Acoustics (WASPAA), 259–263. New Paltz, NY, USA.

    Gharib, S., Drossos, K., Emre, C., Serdyuk, D., & Virtanen, T. (2018). Unsupervised Adversarial Domain
    Adaptation for Acoustic Scene Classification. Proceedings of the Detection and Classification of
    Acoustic Scenes and Events (DCASE). Surrey, UK.

Files

configs.py - Training configurations (C0 ... C3M)
generator.py - Data generator
losses.py - Loss implementations
model.py - Function to create dual-input / dual-output model
model_kaggle.py - reference CNN model from related work for acoustic scene classification (ASC)
normalization.py - Normalization methods (see Mezza et al. above)
params.py - General parameters
prediction.py - Prediction script to evaluate models on test data
training.py - Script to run the model training for 6 different configurations (see Fig. 2 in the paper)

How to run

create python environment (e.g. with conda), the following versions were used during the paper preparation process
- librosa==0.8.0
- matplotlib==3.3.2
- numpy=1.19.2
- python=3.7.0
- scikit-learn==0.23.2
- tensorflow==2.3.0
- torch==1.9.0
set in params.py the following variables
- dir_feat to your local copy of the .p files from https://zenodo.org/record/1401995
- dir_target to your local output folder
run python training.py && python prediction.py on a GPU device to train & evaluate the models

Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning

Related tags

Overview

Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning

Reference

Related Work

Files

How to run

Owner

Jakob Abeßer

Visual Tracking by TridenAlign and Context Embedding

A repo that contains all the mesh keys needed for mesh backend, along with a code example of how to use them in python

The repository forked from NVlabs uses our data. (Differentiable rasterization applied to 3D model simplification tasks)

FOSS Digital Asset Distribution Platform built on Frappe.

An e-commerce company wants to segment its customers and determine marketing strategies according to these segments.

A SAT-based sudoku solver

Simulation of the solar system using various nummerical methods

Official codebase for ICLR oral paper Unsupervised Vision-Language Grammar Induction with Shared Structure Modeling

This is the code for the paper "Contrastive Clustering" (AAAI 2021)

An SE(3)-invariant autoencoder for generating the periodic structure of materials

GBIM(Gesture-Based Interaction map)

Fusion-DHL: WiFi, IMU, and Floorplan Fusion for Dense History of Locations in Indoor Environments

DiscoNet: Learning Distilled Collaboration Graph for Multi-Agent Perception [NeurIPS 2021]

(CVPR 2022) Pytorch implementation of "Self-supervised transformers for unsupervised object discovery using normalized cut"

Semi-Supervised Semantic Segmentation via Adaptive Equalization Learning, NeurIPS 2021 (Spotlight)

DockStream: A Docking Wrapper to Enhance De Novo Molecular Design

Realtime segmentation with ENet, the fast and accurate segmentation net.

The implementation of "Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer"

Pytorch implementation for ACMMM2021 paper "I2V-GAN: Unpaired Infrared-to-Visible Video Translation".

PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO