[email protected] Reverb Database. | PythonRepo" /> [email protected] Reverb Database. | PythonRepo">

The purpose of this code base is to add a specified signal-to-noise ratio noise from MUSAN dataset to a pure speech signal and to generate far-field speech data using room impulse response data from BUT [email protected] Reverb Database.

Overview

Add_noise_and_rir_to_speech

The purpose of this code base is to add a specified signal-to-noise ratio noise from MUSAN dataset to a pure speech signal and to generate far-field speech data using room impulse response data from BUT [email protected] Reverb Database.

Noise and RIR dataset description:

  • BUT [email protected] Reverb Database:

    The database is being built with respect to collect a large number of various Room Impulse Responses, Room environmental noises (or "silences"), Retransmitted speech (for ASR and SID testing), and meta-data (positions of microphones, speakers etc.).

    The goal is to provide speech community with a dataset for data enhancement and distant microphone or microphone array experiments in ASR and SID.

    In this codebase, we only use the RIR data, which is used to synthesize far-field speech, the composition of the RIR dataset and citation details are as follows.

    Room Name Room Type Size (length, depth, height) (m) (microphone_num x loudspeaker_num)
    Q301 Office 10.7x6.9x2.6 31 x 3
    L207 Office 4.6x6.9x3.1 31 x 6
    L212 Office 7.5x4.6x3.1 31 x 5
    L227 Stairs 6.2x2.6x14.2 31 x 5
    R112 Hotel room 4.4x2.8x2.6 31 x 5
    CR2 Conference room 28.2x11.1x3.3 31 x 4
    E112 Lecture room 11.5x20.1x4.8 31 x 2
    D105 Lecture room 17.2x22.8x6.9 31 x 6
    C236 Meeting room 7.0x4.1x3.6 31 x 10
    @ARTICLE{8717722,
             author={Szöke, Igor and Skácel, Miroslav and Mošner, Ladislav and Paliesek, Jakub and Černocký, Jan},
             journal={IEEE Journal of Selected Topics in Signal Processing}, 
             title={Building and evaluation of a real room impulse response dataset}, 
             year={2019},
             volume={13},
             number={4},
             pages={863-876},
             doi={10.1109/JSTSP.2019.2917582}
     }
    
  • MUSAN database:

    The database consists of music from several genres, speech from twelve languages, and a wide assortment of technical and non-technical noises and we only use the noise data in this database. Citation details are as follows.

    @misc{snyder2015musan,
          title={MUSAN: A Music, Speech, and Noise Corpus}, 
          author={David Snyder and Guoguo Chen and Daniel Povey},
          year={2015},
          eprint={1510.08484},
          archivePrefix={arXiv},
          primaryClass={cs.SD}
    }
    

Before using the data-processing code:

  • If you do not want the original dataset to be overwritten, please download the dataset again for use

  • You need to create three files: 'training_list.txt', 'validation_list.txt', 'testing_list.txt', based on your training, validation and test data file paths respectively, and ensure the audio in the file paths can be read and written.

  • The content of the aforementioned '*_list.txt' files are in the following form:

    *_list.txt
    	/../...../*.wav
    	/../...../*.wav
    	/../...../*.wav
    

Instruction for using the following data-processing code:

  1. mix_cleanaudio_with_rir_offline.py: Generate far-field speech offline

    • two parameters are needed:

      • --data_root: the data path which you want to download and store the RIR dataset in.
      • --clean_data_list_path: the path of the folder in which 'training_list.txt', 'validation_list.txt', 'testing_list.txt' are stored in
    • 2 folders will be created in data_root: 'ReverDB_data (Removable if needed)', 'ReverDB_mix'

  2. download_and_extract_noise_file.py: Generate musan noise file

    • one parameters are needed:
      • --data_root: the data path which you want to download and store the noise dataset in.
    • 2 folder will be created in data_root: 'musan (Removable if needed)', 'noise'
  3. vad_torch.py: Voice activity detection when adding noise to the speech

    The noise data is usually added online according to the SNR requirements, several pieces of code are provided below, please add them in the appropriate places according to your needs!

    import torchaudio
    import numpy as np
    import torch
    import random
    from vad_torch import VoiceActivityDetector
    
    
    def _add_noise(speech_sig, vad_duration, noise_sig, snr):
        """add noise to the audio.
        :param speech_sig: The input audio signal (Tensor).
        :param vad_duration: The length of the human voice (int).
        :param noise_sig: The input noise signal (Tensor).
        :param snr: the SNR you want to add (int).
        :returns: noisy speech sig with specific snr.
        """
        if vad_duration != 0:
            snr = 10**(snr/10.0)
            speech_power = torch.sum(speech_sig**2)/vad_duration
            noise_power = torch.sum(noise_sig**2)/noise_sig.shape[1]
            noise_update = noise_sig / torch.sqrt(snr * noise_power/speech_power)
    
            if speech_sig.shape[1] > noise_update.shape[1]:
                # padding
                temp_wav = torch.zeros(1, speech_sig.shape[1])
                temp_wav[0, 0:noise_update.shape[1]] = noise_update
                noise_update = temp_wav
            else:
                # cutting
                noise_update = noise_update[0, 0:speech_sig.shape[1]]
    
            return noise_update + speech_sig
        
        else:
            return speech_sig
        
    def main():
        # loading speech file
        speech_file = './speech.wav'
    	waveform, sr = torchaudio.load(speech_file)
    	waveform = waveform - waveform.mean()
    	
        # loading noise file and set snr
    	snr = 0       
    	noise_file = random.randint(1, 930)
    	
        # Voice activity detection
    	v = VoiceActivityDetector(waveform, sr)
    	raw_detection = v.detect_speech()
    	speech_labels = v.convert_windows_to_readible_labels(raw_detection)
    	vad_duration = 0
        if not len(speech_labels) == 0:
            for i in range(len(speech_labels)):
                start = speech_labels[i]['speech_begin']
                end = speech_labels[i]['speech_end']
                vad_duration = vad_duration + end-start
                
    	# adding noise
        noise, _ = torchaudio.load('/notebooks/noise/' + str(noise_file) + '.wav')
        waveform = _add_noise(waveform, vad_duration, noise, snr)
    
    if __name__ == '__main__':
        main()
Owner
Yunqi Chen
3rd-year undergraduate student; Passionate about all kinds of sports and everything interesting!
Yunqi Chen
Test for using pyIIIFpres for rara magnetica project

raramagnetica_pyIIIFpres Test for using pyIIIFpres for rara magnetica project. This test show how to use pyIIIFpres for creating mannifest compliant t

Giacomo Marchioro 1 Dec 03, 2021
Kunai Shitty Raider Leaked LMFAO

Kunai-Raider-Leaked Kunai Shitty Raider Leaked LMFA

5 Nov 24, 2021
LINUX-AOS (Automatic Optimization System)

LINUX-AOS (Automatic Optimization System)

1 Jul 12, 2022
Python Multilingual Ucrel Semantic Analysis System

PymUSAS Python Multilingual Ucrel Semantic Analysis System, it currently is a rule based token level semantic tagger which can be added to any spaCy p

UCREL 13 Nov 18, 2022
Developed a website to analyze and generate report of students based on the curriculum that represents student’s academic performance.

Developed a website to analyze and generate report of students based on the curriculum that represents student’s academic performance. We have developed the system such that, it will automatically pa

VIJETA CHAVHAN 3 Nov 08, 2022
Xbox-Flood is for flood anything

Intruduction Installation Usage Installing Python 3 Wiki Getting Started Creating a Key Intruduction Xbox-Flood is for flooding messages (invitations

kayake 4 Feb 18, 2022
Script to work around some quirks of the blender obj importer

ObjFix 1.0 (WIP) Script to work around some quirks of the blender obj importer Installation Download this repo In Blender, press "Edit" on the top-bar

Red_3D 4 Nov 20, 2021
An attempt at furthering Factorio Calculator to work in more general contexts.

factorio-optimizer Lets do Factorio Calculator but make it optimize. Why not use Factorio Calculator? Becuase factorio calculator is not general. The

Jonathan Woollett-Light 1 Jun 03, 2022
A Red Team tool for exfiltrating sensitive data from Jira tickets.

Jir-thief This Module will connect to Jira's API using an access token, export to a word .doc, and download the Jira issues that the target has access

Antonio Piazza 82 Dec 12, 2022
edgetest is a tox-inspired python library that will loop through your project's dependencies, and check if your project is compatible with the latest version of each dependency

Bleeding edge dependency testing Full Documentation edgetest is a tox-inspired python library that will loop through your project's dependencies, and

Capital One 16 Dec 07, 2022
A python script to decrypt media files encrypted using the Android application 'Secret Calculator Photo Vault'. Supports brute force of PIN also.

A python script to decrypt media files encrypted using the Android application 'Secret Calculator Photo Vault'. Supports brute force of PIN also.

3 May 01, 2022
An app that mirrors your phone to your compute and maps controller input to the screen

What is 'Dragalia Control'? An app that mirrors your phone to your compute and maps controller input to the screen. Inputs are mapped specifically for

1 May 03, 2022
A supercharged version of paperless: scan, index and archive all your physical documents

Paperless-ng Paperless (click me) is an application by Daniel Quinn and contributors that indexes your scanned documents and allows you to easily sear

Jonas Winkler 5.3k Jan 09, 2023
Яндекс тренировки по алгоритмам. Июнь 2021

Young&&Yandex Тренировки по алгоритмам Если вы хотите попасть на летнюю стажировку в Яндекс, но пока не уверены в своих силах, приходите на наши трени

Podlevskiy Viktor 6 Sep 03, 2021
In this project, we'll be creating a virtual personal assistant for ourselves using our favorite programming language

In this project, we'll be creating a virtual personal assistant for ourselves using our favorite programming language, Python. We can perform several offline as well as online operations using the bo

Ashutosh Krishna 188 Jan 03, 2023
Various hdas (Houdini Digital Assets)

aaTools My various assets for Houdini "ms_asset_loader" - Custom importer assets from Quixel Bridge "asset_placer" - Tool for placment sop geometry on

9 Dec 19, 2022
Pyrmanent - Make all your classes permanent in a flash 💾

Pyrmanent A base class to make your Python classes permanent in a flash. Features Easy to use. Great compatibility. No database needed. Ask for new fe

Sergio Abad 4 Jan 07, 2022
A tool to improve Boolean satisfiability (SAT) solver user's life

SatHelper This is a tool to improve the Boolean satisfiability (SAT) and MaxSAT solver user's life. It helps you model various problems as SAT and Max

Tomas Balyo 1 Nov 16, 2021
Implent of Oracle Base line and Lea-3 Baseline

Oracle-Baseline Implent of Oracle Base line and Lea-3 Baseline Oracle Oracle : This model is used to obtain an oracle with a greedy algorithm similar

Andrew Zeng 2 Nov 12, 2021
Reso is a low-level circuit design language and simulator, inspired by things like Redstone, Conway's Game of Life, and Wireworld.

Reso Reso is a low-level circuit design language and simulator, inspired by things like Redstone, Conway's Game of Life, and Wireworld. What is Reso?

Lynn 287 Nov 26, 2022