ReTiSAR

Implementation of the Real-Time Spherical Microphone Renderer for binaural reproduction in Python [1][2].

Requirements

macOS (tested on 10.14 Mojave and 10.15 Catalina) or Linux (tested on 5.9.1-1-rt19-MANJARO)
(Windows is not supported due to an incompatibility with the current multiprocessing implementation)
JACK library (prebuilt installers / binaries are available)
Conda installation (miniconda is sufficient; provides an easy way to get numpy builds optimized with Intel MKL or alternatively OpenBLAS, which is highly recommended)
Python installation (tested with 3.7 to 3.9; recommended way to get Python is to use Conda as described in the setup section)
Installation of the required Python packages (recommended way is to use Conda as described in the setup section)
Optional: Download of publicly available measurement data for alternative execution modes (always check the command line output or log files in case the rendering pipeline does not initialize successfully!)
Optional: Install an OSC client for real-time feedback and remote control options during runtime

Setup

Clone repository with command line or any other git client:
git clone https://github.com/AppliedAcousticsChalmers/ReTiSAR.git
- Alternative: Download and extract snapshot manually from provided URL (not recommended due to not being able to pull updates)
- Alternative: Update your local copy with changes from the repository (if you have cloned it in the past):
  git pull
Navigate into the repository (the directory containing setup.py):
cd ReTiSAR/
Install the required Python packages (Conda is recommended):
- Make sure that Conda is up to date:
  conda update conda
- Create new Conda environment from the specified requirements (--force to overwrite potentially existing outdated environment):
  conda env create --file environment.yml --force
- Activate created Conda environment:
  source activate ReTiSAR

Quickstart

Follow requirements and setup instructions
During first execution, some small amount of additional mandatory external measurement data will be downloaded automatically, see remark in execution modes (requires Internet connection)
Start JACK server with desired sampling rate (all demo configurations are in 48 kHz):
jackd -d coreaudio -r 48000 [macOS]
jackd -d alsa -r 48000 [Linux]
Remark: Check e.g. the jackd -d coreaudio -d -l command to specify the audio interface that should be used!
Run package with [default] parameters to hear a binaural rendering of a raw Eigenmike recording:
python -m ReTiSAR
Option 1: Modify the configuration by changing the default parameters in config.py (prepared block comments for the specific execution modes below exist).
Option 2: Modify the configuration by command line arguments (like in the following examples showing different execution parameters and modes, see --help).
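
When scripting several runs, the command line described in Option 2 can be assembled programmatically. The following is a minimal sketch; the helper name build_command is hypothetical (not part of the package) and only flags shown in this README are used:

```python
def build_command(params, python="python"):
    """Assemble a `python -m ReTiSAR` call from a dict of short flags.

    `params` maps flag names without the leading dash (e.g. "b", "tt")
    to their values, mirroring the `-flag=value` style used in this README.
    """
    command = [python, "-m", "ReTiSAR"]
    for flag, value in params.items():
        command.append(f"-{flag}={value}")
    return command


if __name__ == "__main__":
    # Example: block size 1024 with automatic head rotation.
    print(" ".join(build_command({"b": 1024, "tt": "AUTO_ROTATE"})))
```

The resulting list can be passed to subprocess.run() once the JACK server is running.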

JACK initialization: In case you have never started the JACK audio server on your system, or want to make sure it initializes with appropriate values, open the JackPilot application and set your system-specific default settings.
At this point the only relevant JACK audio server setting is the sampling frequency, which has to match the sampling frequency of your rendered audio source file or stream (no resampling will be applied for that specific file).
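
Since the source file is not resampled, it can help to check its sampling rate before starting the JACK server. A minimal sketch using only the Python standard library; the function names are hypothetical, the JACK rate is passed in by hand rather than queried from the server, and the wave module only reads PCM files (32 bit float WAV files may be rejected):

```python
import wave


def wav_sample_rate(path):
    """Return the sampling rate in Hz stored in a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def matches_jack_rate(path, jack_rate):
    """True if the file rate equals the rate the JACK server was started with."""
    return wav_sample_rate(path) == jack_rate
```

For example, matches_jack_rate("res/source/Drums_48.wav", 48000) would be expected to hold for a server started with -r 48000, assuming the file name reflects its sampling rate.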

FFTW optimization: In case the rendering takes very long to start (after the message "initializing FFTW DFT optimization ..."), you can either accept this long computation time once (per rendering configuration) or lower the FFTW planner effort (see --help).

Rendering performance: Follow these remarks to achieve continuous and artifact-free rendering:

Optional components like array pre-rendering, headphone equalization, noise generation, etc. only cost performance when they are deployed, so disable the ones you do not need.
Extended IR lengths (particularly for modes with an array IR pre-rendering) will massively increase the computational load depending on the chosen block size (partitioned convolution).
Currently, there is no partitioned convolution for the main binaural renderer with SH based processing, hence the FIR taps of the applied HRIR, Modal Radial Filters and further compensations (e.g. Spherical Head Filter) need to cumulatively fit inside the chosen block size.
A larger block size means lower computational load in real-time rendering but also increased system latency. This is most relevant for modes with array live-stream rendering, but it also affects all other modes in terms of a slightly "smeared" head-tracking experience (noticeable at 4096 samples).
Adjust output levels of all rendering components (default parameters chosen accordingly) to prevent signal clipping (indicated by warning messages during execution).
Check that the JACK system load (e.g. in JackPilot or OSC_Remote_Demo.pd) stays below approx. 95% in order to prevent dropouts (note that the overall system load reported by the OS is not a good indicator).
Check JACK detected dropouts ("xruns" indicated during execution).
Most of all, use your ears! If something sounds strange, there is probably something going wrong... ;)
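
To put the block size remark above into numbers, the duration of a single processing block follows directly from the block size and the sampling rate. A small sketch (the function name is hypothetical, and the value is only the duration of one block, not the total latency of the audio chain):

```python
def block_latency_ms(block_size, sample_rate=48000):
    """Duration of one processing block in milliseconds."""
    return 1000.0 * block_size / sample_rate


if __name__ == "__main__":
    # Block sizes used in the examples of this README at 48 kHz.
    for block_size in (4096, 1024, 256):
        print(f"{block_size:5d} samples -> {block_latency_ms(block_size):6.2f} ms")
```

At 48 kHz this gives roughly 85 ms for 4096 samples, 21 ms for 1024 samples and 5 ms for 256 samples.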

Always check the command line output or generated log files in case the rendering pipeline does not initialize successfully!
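
One possible reading of the remark that the FIR taps of all SH processing filters need to cumulatively fit inside the block size: cascading FIR filters by convolution yields a filter whose length is the sum of the individual lengths minus the number of filters plus one. The sketch below checks that length against a block size. Both function names and the filter lengths in the example are hypothetical, and the check actually performed by the package may differ:

```python
def cascaded_fir_length(*tap_counts):
    """Length of the FIR filter obtained by convolving several FIR filters."""
    if not tap_counts:
        return 0
    return sum(tap_counts) - (len(tap_counts) - 1)


def fits_block(block_size, *tap_counts):
    """True if the cascaded filter is no longer than one processing block."""
    return cascaded_fir_length(*tap_counts) <= block_size


if __name__ == "__main__":
    # Hypothetical lengths: 128 tap HRIR, 120 tap radial filter, 32 tap head filter.
    taps = (128, 120, 32)
    for block_size in (256, 512):
        print(block_size, cascaded_fir_length(*taps), fits_block(block_size, *taps))
```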

Execution parameters

The following parameters are all optional and can be combined with the execution modes described in the next section:

Run with a specific processing block size (choose the value according to the individual rendering configuration and performance of your system)
- The largest block size (the best performance but noticeable input latency):
  python -m ReTiSAR -b=4096 [default]
- Try smaller block sizes according to the specific rendering configuration and individual system performance:
  python -m ReTiSAR -b=1024
  python -m ReTiSAR -b=256
Run with a specific processing word length
- Single precision 32 bit (better performance):
  python -m ReTiSAR -SP=TRUE [default]
- Double precision 64 bit (no configuration with an actual benefit is known):
  python -m ReTiSAR -SP=FALSE
Run with a specific IR truncation cutoff level (applied to all IRs)
- Cutoff -60 dB under peak (better performance and perceptually irrelevant in most cases):
  python -m ReTiSAR -irt=-60 [default]
- No cutoff to render the entire IR (this constitutes tough performance requirements in the case of rendering array IRs with long reverberation):
  python -m ReTiSAR -irt=0 [applied in all scientific evaluations]
Run with a specific head-tracking device (paths are system dependent!)
- No tracking (head movement can be remote controlled):
  python -m ReTiSAR -tt=NONE [default]
- Automatic rotation:
  python -m ReTiSAR -tt=AUTO_ROTATE
- Tracker Razor AHRS:
  python -m ReTiSAR -tt=RAZOR_AHRS -t=/dev/tty.usbserial-AH03F9XC
- Tracker Polhemus Patriot:
  python -m ReTiSAR -tt=POLHEMUS_PATRIOT -t=/dev/tty.UC-232AC
- Tracker Polhemus Fastrack:
  python -m ReTiSAR -tt=POLHEMUS_FASTRACK -t=/dev/tty.UC-232AC
Run with a specific HRTF dataset as MIRO [6] or SOFA [7] files
- Neumann KU100 artificial head from [6] as SOFA:
  python -m ReTiSAR -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA [default]
- Neumann KU100 artificial head from [6] as MIRO:
  python -m ReTiSAR -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir_struct.mat -hrt=HRIR_MIRO
- Neumann KU100 artificial head from [8] as SOFA:
  python -m ReTiSAR -hr=res/HRIR/KU100_SADIE2/48k_24bit_256tap_8802dir.sofa -hrt=HRIR_SOFA
- GRAS KEMAR artificial head from [8] as SOFA:
  python -m ReTiSAR -hr=res/HRIR/KEMAR_SADIE2/48k_24bit_256tap_8802dir.sofa -hrt=HRIR_SOFA
- FABIAN artificial head from [9] as SOFA:
  python -m ReTiSAR -hr=res/HRIR/FABIAN_TUB/44k_32bit_256tap_11950dir_HATO_0.sofa -hrt=HRIR_SOFA
- Employ an arbitrary (artificial or individual) dataset by providing a relative / absolute path!
- The length of the employed HRIR dataset constrains the minimum usable rendering block size!
- Mismatched IRs with a sampling frequency different to the source material will be resampled!
Run with specific headphone equalization / compensation filters (arbitrary filter length). The compensation filter should match the used headphone model or even the individual headphone. In the best case, the filter was also measured on the same head (artificial or individual) as the utilized HRIR.
- No individual headphone compensation:
  python -m ReTiSAR -hp=NONE [default]
- Beyerdynamic DT990 headphone on Neumann KU100 artificial head from [8]:
  python -m ReTiSAR -hp=res/HPCF/KU100_SADIE2/48k_24bit_1024tap_Beyerdynamic_DT990.wav
- Beyerdynamic DT990 headphone on GRAS KEMAR artificial head from [8]:
  python -m ReTiSAR -hp=res/HPCF/KEMAR_SADIE2/48k_24bit_1024tap_Beyerdynamic_DT990.wav
- AKG K701 headphone on FABIAN artificial head from [9]:
  python -m ReTiSAR -hp=res/HPCF/FABIAN_TUB/44k_32bit_4096tap_AKG_K701.wav
- Sennheiser HD800 headphone on FABIAN artificial head from [9]:
  python -m ReTiSAR -hp=res/HPCF/FABIAN_TUB/44k_32bit_4096tap_Sennheiser_HD800.wav
- Sennheiser HD600 headphone on GRAS KEMAR artificial head from TU Rostock:
  python -m ReTiSAR -hp=res/HPCF/KEMAR_TUR/44k_24bit_2048tap_Sennheiser_HD600.wav
- Check the res/HPCF/ directory for numerous other headphone models or employ arbitrary (artificial or individual) compensation filters by providing a relative / absolute path!
- Mismatched IRs with a sampling frequency different to the source material will be resampled!
Run with specific SH processing compensation techniques (relevant for rendering modes utilizing spherical harmonics)
- Modal Radial Filters [always applied] with an individual amplification soft-limiting in dB according to [3]:
  python -m ReTiSAR -arr=18 [default]
- Spherical Head Filter according to [5]:
  python -m ReTiSAR -sht=SHF
- Spherical Harmonics Tapering in combination with the Spherical Head Filter according to [4]:
  python -m ReTiSAR -sht=SHT+SHF [default]
Run with some specific emulated self-noise as additive component to each microphone array sensor (the performance requirements increase according to channel count)
- No noise (yielding the best performance):
  python -m ReTiSAR -gt=NONE [default]
- White noise (also setting the initial output level and mute state of the rendering component):
  python -m ReTiSAR -gt=NOISE_WHITE -gl=-30 -gm=FALSE
- Pink noise by IIR filtering (higher performance requirements):
  python -m ReTiSAR -gt=NOISE_IIR_PINK -gl=-30 -gm=FALSE
- Eigenmike noise coloration by IIR filtering from [10]:
  python -m ReTiSAR -gt=NOISE_IIR_EIGENMIKE -gl=-30 -gm=FALSE
For further configuration parameters, check Option 1 and Option 2 in the Quickstart section above!
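
The IR truncation cutoff level listed above can be illustrated with a short sketch: the impulse response is cut after the last sample whose magnitude still reaches the cutoff relative to the peak. This is an assumption about the general technique, not the package's actual implementation; the function name is hypothetical and a value of 0 is treated as "no cutoff", as in the -irt=0 example:

```python
def truncate_ir(ir, cutoff_db=-60.0):
    """Cut an impulse response after the last sample above the cutoff.

    The cutoff is given in dB relative to the absolute peak. A value of 0
    (or any non-negative value) leaves the impulse response untouched.
    """
    if cutoff_db >= 0 or not ir:
        return list(ir)
    peak = max(abs(sample) for sample in ir)
    # Convert the level in dB into a linear amplitude threshold.
    threshold = peak * 10.0 ** (cutoff_db / 20.0)
    last = max(i for i, sample in enumerate(ir) if abs(sample) >= threshold)
    return list(ir[: last + 1])


if __name__ == "__main__":
    # Synthetic decay that halves with every sample (about -6 dB per sample).
    decay = [0.5 ** n for n in range(40)]
    print(len(decay), len(truncate_ir(decay)), len(truncate_ir(decay, 0)))
```

For the synthetic decay in the example, the -60 dB threshold is 0.001 relative to a peak of 1, so the first 10 of the 40 samples are kept.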

Execution modes

This section lists all the conceptually different rendering modes of the pipeline. Most of the previously introduced execution parameters can be combined with the mode-specific parameters. If no value is provided for a specific rendering parameter (as in the following examples), its default value will be used.

Most execution modes require additional external measurement data, which cannot be republished here. However, all provided examples are based on publicly available research data. Respective files are represented here by provided source reference files (see res/), containing a source URL and potentially further instructions. In case the respective resource data file is not yet available on your system, download instructions will be shown in the command line output and generated log files.

Run as array recording renderer
- Eigenmike at Chalmers lab space with speaker moving horizontally around the array:
  python -m ReTiSAR -sh=4 -tt=NONE -s=res/record/EM32ch_lab_voice_around.wav -ar=res/ARIR/RT_calib_EM32ch_struct.mat -art=AS_MIRO -arl=0 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA [default]
- Eigenmike at Chalmers lab space with speaker moving vertically in front of the array:
  python -m ReTiSAR -sh=4 -tt=NONE -s=res/record/EM32ch_lab_voice_updown.wav -ar=res/ARIR/RT_calib_EM32ch_struct.mat -art=AS_MIRO -arl=0 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
- Zylia ZM-1 at TH Cologne office (recording file not provided!):
  python -m ReTiSAR -b=512 -sh=3 -tt=NONE -s=res/record/ZY19_off_around.wav -sl=9 -ar=res/ARIR/RT_calib_ZY19_struct.mat -art=AS_MIRO -arl=0 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
- HØSMA-7N at TH Cologne lecture hall (recording file not provided!):
  python -m ReTiSAR -b=2048 -sh=7 -tt=NONE -s=res/record/HOS64_hall_lecture.wav -sp="[(90,0)]" -sl=9 -ar=res/ARIR/RT_calib_HOS64_struct.mat -art=AS_MIRO -arl=0 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
Run as array live-stream renderer with minimum latency (e.g. Eigenmike with the respective channel calibration provided by the manufacturer)
- Eigenmike Chalmers EM32 (SN 28):
  python -m ReTiSAR -b=512 -sh=4 -tt=NONE -s=None -ar=res/ARIR/RT_calib_EM32ch_struct.mat -art=AS_MIRO -arl=0 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
- Eigenmike Facebook Reality Labs EM32 (SN ??):
  python -m ReTiSAR -b=512 -sh=4 -tt=NONE -s=None -ar=res/ARIR/RT_calib_EM32frl_struct.mat -art=AS_MIRO -arl=0 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
- Zylia ZM-1:
  python -m ReTiSAR -b=512 -sh=3 -tt=NONE -s=None -ar=res/ARIR/RT_calib_ZY19_struct.mat -art=AS_MIRO -arl=0 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
- TH Cologne HØSMA-7N:
  python -m ReTiSAR -b=2048 -sh=7 -tt=NONE -s=None -ar=res/ARIR/RT_calib_HOS64_struct.mat -art=AS_MIRO -arl=0 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
Run as array IR renderer, e.g. Eigenmike
- Simulated plane wave:
  python -m ReTiSAR -sh=4 -tt=AUTO_ROTATE -s=res/source/Drums_48.wav -ar=res/ARIR/DRIR_sim_EM32_PW_struct.mat -art=ARIR_MIRO -arl=-6 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
- Anechoic measurement:
  python -m ReTiSAR -sh=4 -tt=AUTO_ROTATE -s=res/source/Drums_48.wav -ar=res/ARIR/DRIR_anec_EM32ch_S_struct.mat -art=ARIR_MIRO -arl=0 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
Run as array IR renderer, e.g. sequential VSA measurements from [11] at the maximum respective SH order (different room, source positions and array configurations are available in res/ARIR/)
- 50ch (sh5), LBS center:
  python -m ReTiSAR -sh=5 -tt=AUTO_ROTATE -s=res/source/Drums_48.wav -ar=res/ARIR/DRIR_LBS_VSA_50RS_PAC.sofa -art=ARIR_SOFA -arl=-12 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
- 86ch (sh7), SBS center:
  python -m ReTiSAR -sh=7 -tt=AUTO_ROTATE -s=res/source/Drums_48.wav -ar=res/ARIR/DRIR_SBS_VSA_86RS_PAC.sofa -art=ARIR_SOFA -arl=-12 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
- 110ch (sh8), CR1 left:
  python -m ReTiSAR -sh=8 -tt=AUTO_ROTATE -s=res/source/Drums_48.wav -sp="[(-37,0)]" -ar=res/ARIR/DRIR_CR1_VSA_110RS_L.sofa -art=ARIR_SOFA -arl=-12 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
- 194ch (sh11, open sphere, cardioid microphones), LBS center:
  python -m ReTiSAR -sh=11 -tt=AUTO_ROTATE -s=res/source/Drums_48.wav -ar=res/ARIR/DRIR_LBS_VSA_194OSC_PAC.sofa -art=ARIR_SOFA -arl=-12 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
- 1202ch (truncated sh12), CR7 left:
  python -m ReTiSAR -sh=12 -tt=AUTO_ROTATE -s=res/source/Drums_48.wav -sp="[(-37,0)]" -ar=res/ARIR/DRIR_CR7_VSA_1202RS_L.sofa -art=ARIR_SOFA -arl=-12 -hr=res/HRIR/KU100_THK/48k_32bit_128tap_2702dir.sofa -hrt=HRIR_SOFA
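
The SH orders chosen in the examples above relate to the channel counts of the arrays: a decomposition up to order N yields (N + 1)^2 spherical harmonic coefficients, so the array needs at least that many microphones. This is a necessary condition only, since the order that can actually be resolved also depends on the sampling grid. A short sketch with hypothetical function names, using the channel counts and orders from the examples in this section:

```python
def sh_channel_count(order):
    """Number of spherical harmonic coefficients up to the given order."""
    return (order + 1) ** 2


def order_is_feasible(order, num_microphones):
    """Necessary (not sufficient) condition for decomposing at this order."""
    return sh_channel_count(order) <= num_microphones


# (microphone channels, SH order) pairs taken from the examples above.
EXAMPLES = [(32, 4), (19, 3), (64, 7), (50, 5), (86, 7), (110, 8), (194, 11), (1202, 12)]

if __name__ == "__main__":
    for channels, order in EXAMPLES:
        print(f"{channels:5d} ch, order {order:2d}: {sh_channel_count(order):4d} coefficients")
```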

Note that the rendering performance is mostly determined by the chosen combination of the following parameters: number of microphones (ARIR channels), room reverberation time (ARIR length), IR truncation cutoff level and rendering block size.
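
How these parameters interact can be estimated roughly. Assuming a uniformly partitioned convolution with partitions equal to the block size (an assumption, the package may partition differently), each ARIR channel needs ceil(IR length / block size) partitions per processed block. The function names and the example values below are hypothetical:

```python
import math


def partitions_per_channel(ir_length, block_size):
    """Number of block-sized partitions needed to cover one impulse response."""
    return math.ceil(ir_length / block_size)


def partitions_per_block(num_channels, ir_length, block_size):
    """Partitions processed per block across all channels (a rough load proxy)."""
    return num_channels * partitions_per_channel(ir_length, block_size)


if __name__ == "__main__":
    # Hypothetical case: 32 channels, 1 s long IRs at 48 kHz after truncation.
    for block_size in (4096, 1024):
        print(block_size, partitions_per_block(32, 48000, block_size))
```

In this hypothetical case, going from 4096 to 1024 samples raises the count from 384 to 1504 partitions per block, while blocks also have to be processed four times as often.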

Run as BRIR renderer (partitioned convolution in frequency domain) for any BRIR compatible to the SoundScape Renderer, e.g. pre-processed array IRs by [12]:
python -m ReTiSAR -tt=AUTO_ROTATE -s=res/source/Drums_48.wav -art=NONE -hr=res/HRIR/KU100_THK/BRIR_CR1_VSA_110RS_L_SSR_SFA_-37_SOFA_RFI.wav -hrt=BRIR_SSR -hrl=-12
Run as "binauralizer" for an arbitrary number of virtual sound sources via HRTF (partitioned convolution in frequency domain) for any HRIR compatible to the SoundScape Renderer:
python -m ReTiSAR -tt=AUTO_ROTATE -s=res/source/PinkMartini_Lilly_44.wav -sp="[(30, 0),(-30, 0)]" -art=NONE -hr=res/HRIR/FABIAN_TUB/hrirs_fabian.wav -hrt=HRIR_SSR (provide respective source file and source positions!)

Remote Control

During runtime, certain parameters of the application can be remote controlled via Open Sound Control. Individual clients can be accessed by targeting them with specific OSC commands on port 5005 [default].
Depending on the current configuration and rendering mode, different commands are available. Arbitrary combinations of the following targets and values are possible:
/generator/volume 0, /generator/volume -12 (set any client output volume in dBFS),
/prerenderer/mute 1, /prerenderer/mute 0, /prerenderer/mute -1, /prerenderer/mute (set/toggle any client mute state),
/hpeq/passthrough true, /hpeq/passthrough false, /hpeq/passthrough toggle (set/toggle any client passthrough state)
The target name is derived from the individual JACK client name for all commands. The order of target client and command can be swapped, and further commands might be available:
/renderer/crossfade, /crossfade/renderer (set/toggle the crossfade state),
/renderer/delay 350.0 (set an additional input delay in ms),
/renderer/order 0, /renderer/order 4 (set the SH rendering order),
/tracker/zero (calibrate the tracker), /tracker/azimuth 45 (set the tracker orientation),
/player/stop, /player/play, /quit (quit all rendering components)
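
Any OSC client can send the commands listed above. As an illustration, the sketch below encodes OSC 1.0 messages with the Python standard library and sends them via UDP. It assumes that the application listens for UDP packets on localhost; the function names are hypothetical, and whether a given command expects an integer or a float argument is not specified here:

```python
import socket
import struct


def _osc_string(text):
    """Encode an OSC string: ASCII, null-terminated, padded to a multiple of 4 bytes."""
    data = text.encode("ascii") + b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def osc_message(address, *args):
    """Build an OSC 1.0 message with int32, float32 or string arguments."""
    type_tags = ","
    payload = b""
    for arg in args:
        if isinstance(arg, bool):
            raise TypeError("bool arguments are not handled in this sketch")
        if isinstance(arg, int):
            type_tags += "i"
            payload += struct.pack(">i", arg)
        elif isinstance(arg, float):
            type_tags += "f"
            payload += struct.pack(">f", arg)
        elif isinstance(arg, str):
            type_tags += "s"
            payload += _osc_string(arg)
        else:
            raise TypeError(f"unsupported OSC argument: {arg!r}")
    return _osc_string(address) + _osc_string(type_tags) + payload


def send_osc(address, *args, host="127.0.0.1", port=5005):
    """Send one OSC message via UDP (5005 is the default port named above)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(osc_message(address, *args), (host, port))


if __name__ == "__main__":
    # Show the encoded bytes without sending anything.
    print(osc_message("/renderer/order", 4))
```

With the renderer running, send_osc("/renderer/order", 4) or send_osc("/tracker/zero") would then issue the respective command.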
During runtime, individual JACK clients also report real-time feedback or analysis data under their respective "target" name on port 5006 [default]. The following examples show the data format (the number of values depends on the number of output ports); arbitrary combinations of name and parameters are possible:
/player/rms 0.0, /generator/peak 0.0 0.0 0.0 0.0 (the current audio output metrics),
/renderer/load 100 (the current client load),
/tracker/AzimElevTilt 0.0 0.0 0.0 (the current head orientation),
/load 100 (the current JACK system load)
The package includes an example remote control client implemented for "vanilla" PD (Pure Data); see further instructions in OSC_Remote_Demo.pd.
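
The feedback messages can also be received without PD. The sketch below decodes plain OSC 1.0 messages with the Python standard library and prints them. It assumes that the feedback arrives as single UDP messages on localhost (OSC bundles are not handled), and all function names are hypothetical:

```python
import socket
import struct


def _read_osc_string(data, offset):
    """Read a null-terminated, 4-byte padded OSC string starting at offset."""
    end = data.index(b"\x00", offset)
    text = data[offset:end].decode("ascii")
    # Skip the terminator and padding up to the next multiple of 4.
    return text, (end + 4) & ~3


def parse_osc_message(data):
    """Decode the address and int32 / float32 / string arguments of one message."""
    address, offset = _read_osc_string(data, 0)
    type_tags, offset = _read_osc_string(data, offset)
    args = []
    for tag in type_tags[1:]:  # skip the leading ","
        if tag == "i":
            args.append(struct.unpack_from(">i", data, offset)[0])
            offset += 4
        elif tag == "f":
            args.append(struct.unpack_from(">f", data, offset)[0])
            offset += 4
        elif tag == "s":
            text, offset = _read_osc_string(data, offset)
            args.append(text)
        else:
            raise ValueError(f"unsupported OSC type tag: {tag}")
    return address, args


def listen(port=5006, host="127.0.0.1"):
    """Print incoming feedback messages until interrupted."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            data, _ = sock.recvfrom(4096)
            print(*parse_osc_message(data))
```

Calling listen() while the renderer is running would print lines such as the address followed by its values, e.g. for /renderer/load.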

Validation - Setup and Execution

Download and build the required ecasound library for signal playback and capture with JACK support:
in the extracted source directory, run ./configure, make and sudo make install while having JACK installed
Optional: Install sendosc tool to be used for automation in shell scripts:
brew install yoggy/tap/sendosc
Remark: Make sure all subsequent rendering configurations are able to start up properly before recording starts (particularly FFTW optimization might take a long time, see above)
Validate impulse responses by comparing against a reference implementation, in this case the output of sound_field_analysis-py [12]
- Execute recording script, consecutively starting the package and capturing impulse responses in different rendering configurations:
  ./res/research/validation/record_ir.sh
  Remark: Both implementations compensate the source being at an incidence angle of -37 degrees in the measurement IR set
- Run package in validation mode, executing a comparison of all beforehand captured IRs in res/research/validation/ against the provided reference IRs:
  python -m ReTiSAR --VALIDATION_MODE=res/HRIR/KU100_THK/BRIR_CR1_VSA_110RS_L_SSR_SFA_-37_SOFA_RFI.wav
Validate signal-to-noise-ratio by comparing input and output signals of the main binaural renderer for wanted target signals and emulated sensor self-noise respectively
- Execute recording script consecutively starting the package and capturing target-noise as well as self-noise input and output signals in different rendering configurations:
  ./res/research/validation/record_snr.sh
- Open (and run) MATLAB analysis script to execute an SNR comparison of beforehand captured signals:
  open ./res/research/validation/calculate_snr.m

Benchmark - Setup and Execution

Install additionally required Python packages into Conda environment:
conda env update --file environment_dev.yml
Run the JACK server with arbitrary sampling rate via JackPilot or in a new command line window ([CMD]+[T]):
jackd -d coreaudio
Run in benchmark mode, instantiating one rendering JACK client with as many convolver instances as possible (40-60 minutes):
python -m ReTiSAR --BENCHMARK_MODE=PARALLEL_CONVOLVERS
Run in benchmark mode, instantiating as many rendering JACK clients as possible with one convolver instance (10-15 minutes):
python -m ReTiSAR --BENCHMARK_MODE=PARALLEL_CLIENTS
Find generated results in the specified files at the end of the script.

References

[1] H. Helmholz, C. Andersson, and J. Ahrens, “Real-Time Implementation of Binaural Rendering of High-Order Spherical Microphone Array Signals,” in Fortschritte der Akustik -- DAGA 2019, 2019, pp. 1462–1465.
[2] H. Helmholz, T. Lübeck, J. Ahrens, S. V. A. Garí, D. Lou Alon, and R. Mehra, “Updates on the Real-Time Spherical Array Renderer (ReTiSAR),” in Fortschritte der Akustik -- DAGA 2020, 2020, pp. 1169–1172.
[3] B. Bernschütz, C. Pörschmann, S. Spors, and S. Weinzierl, “SOFiA Sound Field Analysis Toolbox,” in International Conference on Spatial Audio, 2011, pp. 7–15.
[4] C. Hold, H. Gamper, V. Pulkki, N. Raghuvanshi, and I. J. Tashev, “Improving Binaural Ambisonics Decoding by Spherical Harmonics Domain Tapering and Coloration Compensation,” in International Conference on Acoustics, Speech and Signal Processing, 2019, pp. 261–265, doi: 10.1109/ICASSP.2019.8683751.
[5] Z. Ben-Hur, F. Brinkmann, J. Sheaffer, S. Weinzierl, and B. Rafaely, “Spectral equalization in binaural signals represented by order-truncated spherical harmonics,” J. Acoust. Soc. Am., vol. 141, no. 6, pp. 4087–4096, 2017, doi: 10.1121/1.4983652.
[6] B. Bernschütz, “A spherical far field HRIR/HRTF compilation of the Neumann KU 100,” in Fortschritte der Akustik -- AIA/DAGA 2013, 2013, pp. 592–595.
[7] P. Majdak et al., “Spatially Oriented Format for Acoustics: A Data Exchange Format Representing Head-Related Transfer Functions,” in AES Convention 134, 2013, pp. 262–272.
[8] C. Armstrong, L. Thresh, D. Murphy, and G. Kearney, “A Perceptual Evaluation of Individual and Non-Individual HRTFs: A Case Study of the SADIE II Database,” Appl. Sci., vol. 8, no. 11, pp. 1–21, 2018, doi: 10.3390/app8112029.
[9] F. Brinkmann et al., “The FABIAN head-related transfer function data base.” Technische Universität Berlin, Berlin, Germany, 2017, doi: 10.14279/depositonce-5718.5.
[10] H. Helmholz, D. Lou Alon, S. V. A. Garí, and J. Ahrens, “Instrumental Evaluation of Sensor Self-Noise in Binaural Rendering of Spherical Microphone Array Signals,” in Forum Acusticum, 2020, pp. 1349–1356, doi: 10.48465/fa.2020.0074.
[11] P. Stade, B. Bernschütz, and M. Rühl, “A Spatial Audio Impulse Response Compilation Captured at the WDR Broadcast Studios,” in 27th Tonmeistertagung -- VDT International Convention, 2012, pp. 551–567.
[12] C. Hohnerlein and J. Ahrens, “Spherical Microphone Array Processing in Python with the sound_field_analysis-py Toolbox,” in Fortschritte der Akustik -- DAGA 2017, 2017, pp. 1033–1036.
[13] H. Helmholz, J. Ahrens, D. Lou Alon, S. V. A. Garí, and R. Mehra, “Evaluation of Sensor Self-Noise In Binaural Rendering of Spherical Microphone Array Signals,” in International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 161–165, doi: 10.1109/ICASSP40776.2020.9054434.

Change Log

v2021.03.30
- Addition of Zylia ZM-1 array example recording and live-stream configurations
v2020.11.23
- Consolidation of Python 3.9 compatibility
- Consolidation of Linux compatibility (no modifications were required; tested with Jack 1.9.16 on kernel 5.9.1-1-rt19-MANJARO)
v2020.10.21
- Improvement of establishing JACK and/or OS specific client name length limitation of JackClient
v2020.10.15 (v2020.FA)
- Addition of references to data set for Forum Acusticum [10] publication
v2020.9.10
- Enforcement of Black >= 20.8b1 code style
v2020.8.20
- Extension of JackPlayer to make auto-play behaviour configurable (via config parameter or command line argument)
v2020.8.13
- Update of OSC Remote Demo to reset (after not receiving data) and clip displayed RMS values
- Improvement of FFTW wisdom verification to be more error proof
v2020.7.16
- Addition of experimental SECTORIAL_DEGREE_SELECTION and EQUATORIAL_DEGREE_SELECTION SH weighting techniques (partial elimination of HRIR elevation cues)
- Update of plotting during and after application of SH compensation techniques
v2020.7.7
- Addition of HRIR and HPCF source files to SADIE II database [8]
- Extension of DataRetriever to automatically extract requested resources from downloaded *.zip archives
v2020.7.4
- Introduction of FFTW wisdom file signature verification (in order to update any already accumulated wisdom run with --PYFFTW_LEGACY_FILE=log/pyfftw_wisdom.bin once)
- Fixes for further SonarLint security and code style recommendations
v2020.7.1
- Update and addition of further WDR Cologne ARIR source files (linking to Zenodo data set)
- Hack for Modal Radial Filters generation in open / cardioid SMA configurations (unfortunately this metadata is not directly available in the SOFA ARIR files)
v2020.4.8
- Improvement of IIR pink noise generation (continuous utilization of internal filter delay conditions)
- Improvement of IIR pink noise generation (employment of SOS instead of BA coefficients)
- Addition of IIR Eigenmike coloration noise generation according to [10]
v2020.4.3
- Improvement of white noise generation (vastly improved performance due to numpy SFC64 generator)
- Enabling of JackGenerator (and derivatives) to operate in single precision for improved performance
v2020.3.3
- Addition of further simulated array data sets
v2020.2.24
- Consolidation of Python 3.8 compatibility
- Introduction of multiprocessing context for compatibility
- Enforcement of Black code style
v2020.2.14
- Addition of TH Cologne HØSMA-7N array configuration
v2020.2.10
- Addition of project community information (contributing, code of conduct, issue templates)
v2020.2.7
- Extension of DataRetriever to automatically download data files
- Addition of missing ignored project resources
v2020.2.2
- Change of default rendering configuration to contained Eigenmike recording
- Update of README structure (including Quickstart section)
v2020.1.30
- First publication of code
Pre-release (v2020.ICASSP)
- Contains the older original code state for the ICASSP [13] publication
Pre-release (v2019.DAGA)
- Contains the older original code state for the initial DAGA [1] publication

Contributing

See CONTRIBUTING for full details.

Credits

Written by Hannes Helmholz.

Scientific supervision by Jens Ahrens.

Contributions by Carl Andersson and Tim Lübeck.

This work was funded by Facebook Reality Labs.

License

This software is licensed under a Non-Commercial Software License (see LICENSE for full details).

v2021.TASLP (Jan 9, 2022)

H. Helmholz, D. Lou Alon, S. V. Amengual Garí, and J. Ahrens, “Effects of Additive Noise in Binaural Rendering of Spherical Microphone Array Signals,” IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 29, pp. 3642–3653, 2021, doi: 10.1109/TASLP.2021.3129359.

The code, data structures and included resources are directly based on the continuously developed master branch around the time of this publication.

The procedures utilized to capture rendered ear signals, the resulting raw data, the scripts realizing the Composite Loudness Levels analysis, the intermediate ILD, CLL and CLLgd plots, as well as the scripts to generate the respective result plots are available at: http://doi.org/10.5281/zenodo.3842305
v2020.FA (Oct 15, 2020)

H. Helmholz, D. Lou Alon, S. V. Amengual Garí, and J. Ahrens, “Instrumental Evaluation of Sensor Self-Noise in Binaural Rendering of Spherical Microphone Array Signals,” in Forum Acusticum, 2020, pp. 1349–1356, doi: 10.48465/fa.2020.0074.

The code, data structures and included resources are directly based on the continuously developed master branch around the time of this publication.

The methods for the instrumental SNR evaluation, gathered results and according visualizations, as well as mh acoustics Eigenmike 32 self-noise measurement scripts, recorded results and according visualizations utilized in the publication are available at: http://doi.org/10.5281/zenodo.3711626
v2020.ICASSP (May 6, 2020)

H. Helmholz, J. Ahrens, D. Lou Alon, S. V. Amengual Garí, and R. Mehra, “Evaluation of Sensor Self-Noise In Binaural Rendering of Spherical Microphone Array Signals,” in International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 161–165, doi: 10.1109/ICASSP40776.2020.9054434.

Note that this is an early state of the publication, where the code base, data structures and included resources could be very different from current releases. Accordingly the original project structure is provided here for reference.

The methods for the instrumental evaluation, gathered results and according visualizations utilized in the publication are available at: http://doi.org/10.5281/zenodo.3661422
v2019.DAGA (May 6, 2020)

H. Helmholz, C. Andersson, and J. Ahrens, “Real-Time Implementation of Binaural Rendering of High-Order Spherical Microphone Array Signals,” in Fortschritte der Akustik -- DAGA 2019, 2019, pp. 1462–1465.

Note that this is an early state of the publication, where the code base, data structures and included resources could be very different from current releases. Accordingly the original project structure is provided here for reference.

The methods for the instrumental evaluation, gathered results and according visualizations utilized in the publication are included here.


Owner

Division of Applied Acoustics at Chalmers University of Technology
