L3DAS22 challenge supporting API

Overview

L3DAS22 challenge supporting API

This repository supports the L3DAS22 IEEE ICASSP Grand Challenge and it is aimed at downloading the dataset, pre-processing the sound files and the metadata, training and evaluating the baseline models and validating the final results. We provide easy-to-use instruction to produce the results included in our paper. Moreover, we extensively commented our code for easy customization.

For further information please refer to the challenge website and to the challenge documentation.

Installation

Our code is based on Python 3.7.

To install all required dependencies run:

pip install -r requirements.txt

Follow these instructions to properly create and place the kaggle.json file.

Dataset download

It is possible to download the entire dataset through the script download_dataset.py. This script downloads the data, extracts the archives, merges the 2 parts of task1 train360 files and prepares all folders for the preprocessing stage.

To download run this command:

python download_dataset.py --output_path ./DATASETS --unzip True

This script may take long, especially the unzipping stage.

Alternatively, it is possible to manually download the dataset from Kaggle.

The train360 section of task 1 is split in 2 downloadable files. If you manually download the dataset, you should manually merge the content of the 2 folders. You can use the function download_dataset.merge_train360(). Example:

import download_dataset

train360_path = "path_that_contains_both_train360_parts"
download_dataset.merge_train360(train360_path)

Pre-processing

The file preprocessing.py provides automated routines that load the raw audio waveforms and their correspondent metadata, apply custom pre-processing functions and save numpy arrays (.pkl files) containing the separate predictors and target matrices.

Run these commands to obtain the matrices needed for our baseline models:

python preprocessing.py --task 1 --input_path DATASETS/Task1 --training_set train100 --num_mics 1
python preprocessing.py --task 2 --input_path DATASETS/Task2 --num_mics 1 --frame_len 100

The two tasks of the challenge require different pre-processing.

For Task1 the function returns 2 numpy arrays contatining:

  • Input multichannel audio waveforms (3d noise+speech scenarios) - Shape: [n_data, n_channels, n_samples].
  • Output monoaural audio waveforms (clean speech) - Shape [n_data, 1, n_samples].

For Task2 the function returns 2 numpy arrays contatining:

  • Input multichannel audio spectra (3d acoustic scenarios): Shape: [n_data, n_channels, n_fft_bins, n_time_frames].
  • Output seld matrices containing the class ids of all sounds present in each 100-milliseconds frame alongside with their location coordinates - Shape: [n_data, n_frames, ((n_classes * n_class_overlaps) + (n_classes * n_class_overlaps * n_coordinates))], where n_class_overlaps is the maximum amount of possible simultaneous sounds of the same class (3) and n_coordinates refers to the spatial dimensions (3).

Baseline models

We provide baseline models for both tasks, implemented in PyTorch. For task 1 we use a Beamforming U-Net and for task 2 an augmented variant of the SELDNet architecture. Both models are based on the single-microphone dataset configuration. Moreover, for task 1 we used only Train100 as training set.

To train our baseline models with the default parameters run:

python train_baseline_task1.py
python train_baseline_task2.py

These models will produce the baseline results mentioned in the paper.

GPU is strongly recommended to avoid very long training times.

Alternatively, it is possible to download our pre-trained models with these commands:

python download_baseline_models.py --task 1 --output_path RESULTS/Task1/pretrained
python download_baseline_models.py --task 2 --output_path RESULTS/Task2/pretrained

These models are also available for manual download here.

We also provide a Replicate interactive demo of both baseline models.

Evaluaton metrics

Our evaluation metrics for both tasks are included in the metrics.py script. The functions task1_metric and location_sensitive_detection compute the evaluation metrics for task 1 and task 2, respectively. The default arguments reflect the challenge requirements. Please refer to the above-linked challenge paper for additional information about the metrics and how to format the prediction and target vectors.

Example:

import metrics

task1_metric = metrics.task1_metric(prediction_vector, target_vector)
_,_,_,task2_metric = metrics.location_sensitive_detection(prediction_vector, target_vector)

To compute the challenge metrics for our basiline models run:

python evaluate_baseline_task1.py
python evaluate_baseline_task2.py

In case you want to evaluate our pre-trained models, please add --model_path path/to/model to the above commands.

Submission shape validation

The script validate_submission.py can be used to assess the validity of the submission files shape. Instructions about how to format the submission can be found in the L3das website Use these commands to validate your submissions:

python validate_submission.py --task 1 --submission_path path/to/task1_submission_folder --test_path path/to/task1_test_dataset_folder

python validate_submission.py --task 2 --submission_path path/to/task2_submission_folder --test_path path/to/task2_test_dataset_folder

For each task, this script asserts if:

  • The number of submitted files is correct
  • The naming of the submitted files is correct
  • Only the files to be submitted are present in the folder
  • The shape of each submission file is as expected

Once you have valid submission folders, please follow the instructions on the link above to proceed with the submission.

Owner
L3DAS
The L3DAS project aims at providing new 3D audio datasets and encouraging the proliferation of new deep learning methods for 3D audio analysis.
L3DAS
MVP monorepo to rapidly develop scalable, reliable, high-quality components for Amazon Linux instance configuration management

Ansible Amazon Base Repository Ansible Amazon Base Repository About Setting Up Ansible Environment Configuring Python VENV and Ansible Editor Configur

Artem Veremey 1 Aug 06, 2022
Utilizing the freqtrade high-frequency cryptocurrency trading framework to build and optimize trading strategies. The bot runs nonstop on a Rasberry Pi.

Freqtrade Strategy Repository Please test all scripts and dry run them before using them in live mode Contact me on discord if you have any questions!

Michael Fourie 90 Jan 01, 2023
DEPRECATED - Official Python Client for the Discogs API

โš ๏ธ DEPRECATED This repository is no longer maintained. You can still use a REST client like Requests or other third-party Python library to access the

Discogs 483 Dec 31, 2022
Simple Bot With Python 3.8+ For Converstaion Your Media

Media-Conversation Simple Bot With Python 3.8+ For Converstaion Your Media

Farzin 2 Dec 06, 2021
Stock trading bot made using the Robinhood API / Python library...

High-Low Stock trading bot made using the Robinhood API / Python library... Index Installation Use Development Notes Installation To Install and run t

Reed Graff 1 Jan 07, 2022
โœจ ๐Ÿ Python SDK for StarkNet.

โœจ ๐Ÿ starknet.py StarkNet SDK for Python ๐Ÿ“˜ Documentation Installation Quickstart Guide API Installation To install this package run pip install stark

Software Mansion 158 Jan 04, 2023
๐€ ๐ฆ๐จ๐๐ฎ๐ฅ๐š๐ซ ๐“๐ž๐ฅ๐ž๐ ๐ซ๐š๐ฆ ๐†๐ซ๐จ๐ฎ๐ฉ ๐ฆ๐š๐ง๐š๐ ๐ž๐ฆ๐ž๐ง๐ญ ๐›๐จ๐ญ ๐ฐ๐ข๐ญ๐ก ๐ฎ๐ฅ๐ญ๐ข๐ฆ๐š๐ญ๐ž ๐Ÿ๐ž๐š๐ญ๐ฎ๐ซ๐ž๐ฌ

๐‡๐จ๐ฐ ๐“๐จ ๐ƒ๐ž๐ฉ๐ฅ๐จ๐ฒ For easiest way to deploy this Bot click on the below button ๐Œ๐š๐๐ž ๐๐ฒ ๐’๐ฎ๐ฉ๐ฉ๐จ๐ซ๐ญ ๐†๐ซ๐จ๐ฎ๐ฉ ๐’๐จ๐ฎ๐ซ๐œ๐ž๐ฌ ๐†๐ž๐ง๐ž?

Mukesh Solanki 2 Oct 06, 2021
Azure Neural Speech Service TTS

Written in Python using the Azure Speech SDK. App.py provides an easy way to create an Text-To-Speech request to Azure Speech and download the wav file. Azure Neural Voices Text-To-Speech enables flu

Rodney 4 Dec 14, 2022
Extend the commitizen tools to create conventional commits and README that link to Jira and GitHub.

cz-github-jira-conventional cz-github-jira-conventional is a plugin for the commitizen tools, a toolset that helps you to create conventional commit m

12 Dec 13, 2022
Telegram bot for logistic - Telegram bot for logistic

ะ”ะตะผะพะฝัั‚ั€ะฐั†ะธะพะฝะฝั‹ะน ั‚ะตะปะตะณั€ะฐะผ-ะฑะพั‚ ะดะปั ะฝัƒะถะด ั‚ั€ะฐะฝัะฟะพั€ั‚ะฝะพะน ะบะพะผะฟะฐะฝะธะธ ะฆะตะปัŒ ะฟั€ะพะตะบั‚ะฐ ะ ะตะฐะปะธะท

M1chigun 1 Feb 05, 2022
๐Ÿš€ An asynchronous python API wrapper meant to replace discord.py - Snappy discord api wrapper written with aiohttp & websockets

Pincer An asynchronous python API wrapper meant to replace discord.py โ— The package is currently within the planning phase ๐Ÿ“Œ Links ๏ฝœJoin the discord

Pincer 125 Dec 26, 2022
A code to match you with the perfect Taylor Swift song for your mood and relationship status.

taylorswift A package for matching your current mood and relationship status to a suitable Taylor Swift song. Requirements: Python 2 or 3, and the num

Megan Mansfield 82 Dec 09, 2022
Muzan-Discord-Nuker - A simple discord server nuker in python

Muzan-Discord-Nuker This is Just a simple discord server nuker in python. โœจ Feat

Afnan 3 May 14, 2022
A python script to download twitter space, only works on running spaces (for now).

A python script to download twitter space, only works on running spaces (for now).

279 Jan 02, 2023
Create CDK projects with projen

The Projenator: I'll be back! Description This is a CDKv2 project that takes the grind out of setting up new cdk projects/implementations by using aut

Andrew 2 Dec 11, 2021
A Telegram mirror bot which can be deployed using Heroku.

Slam Mirror Bot This is a telegram bot writen in python for mirroring files on the internet to our beloved Google Drive. Getting Google OAuth API cred

Hafitz Setya 1.2k Jan 01, 2023
A Phyton script working for stream twits from twitter by tweepy to mongoDB

twitter-to-mongo A python script that uses the Tweepy library to pull Tweets with specific keywords from Twitter's Streaming API, and then stores the

3 Feb 03, 2022
Bitcoin-chance-wheel - Try your luck at getting bitcoins

Program Features - โœ๏ธ Why did we name this tool the Lucky Wheel? - โœ๏ธ This tool

hack4lx 20 Dec 22, 2022
Um bot para contar quantas vezes o meu amigo troca de pfp/nick/tag essas coisas ae pq aquele mlk n para quieto

EkiBot Um bot que tem apenas as suas funรงรตes de audit log com as PFP's (avatares) dos usuรกrios Pode ser usado para um usuรกrio em especรญfico, ou atรฉ me

Samuel 3 Aug 11, 2021
A simple Discord bot wrote with Python. Kizmeow let you track your NFT project and display some useful information

Kizmeow-OpenSea-and-Etherscan-Discord-Bot ไธญๆ–‡็‰ˆ | English Ver A Discord bot wrote with Python. Kizmeow let you track your NFT project and display some u

Xeift 93 Dec 31, 2022