pysox

Python wrapper around sox. Read the Docs here.

This library was presented in the following paper:

R. M. Bittner, E. J. Humphrey and J. P. Bello, "pysox: Leveraging the Audio Signal Processing Power of SoX in Python", in Proceedings of the 17th International Society for Music Information Retrieval Conference Late Breaking and Demo Papers, New York City, USA, Aug. 2016.

Install

This requires that SoX version 14.4.2 or higher is installed.
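
If it is unclear which SoX version is on the PATH, a check along these lines may help. This is a sketch and not part of pysox: the helper names are hypothetical, and it assumes that `sox --version` prints text containing something like `v14.4.2`.

```python
import re
import subprocess

# Minimum version stated in the install notes above.
MIN_SOX_VERSION = (14, 4, 2)


def parse_sox_version(text):
    """Extract (major, minor, patch) from text such as 'sox:  SoX v14.4.2'."""
    match = re.search(r"v(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def installed_sox_version():
    """Return the installed SoX version tuple, or None if it cannot be found."""
    try:
        result = subprocess.run(
            ["sox", "--version"], capture_output=True, text=True, check=False
        )
    except OSError:
        # The sox binary is missing or cannot be executed.
        return None
    return parse_sox_version(result.stdout + result.stderr)


def meets_requirement(version, minimum=MIN_SOX_VERSION):
    """True when a version tuple is known and at least the minimum."""
    return version is not None and version >= minimum
```

Calling `meets_requirement(installed_sox_version())` then gives a single boolean to act on.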

To install SoX on Mac with Homebrew:

brew install sox

If you want support for mp3, flac, or ogg files, add the following flags:

brew install sox --with-lame --with-flac --with-libvorbis

On Linux:

apt-get install sox

or install from source.

To install the most up-to-date release of this module via PyPI:

pip install sox

To install the master branch:

pip install git+https://github.com/rabitt/pysox.git

or

git clone https://github.com/rabitt/pysox.git
cd pysox
python setup.py install

Tests

If you have a different version of SoX installed, run the tests locally to make sure everything behaves as expected:

pytest

Examples

import sox
# create transformer
tfm = sox.Transformer()
# trim the audio between 5 and 10.5 seconds.
tfm.trim(5, 10.5)
# apply compression
tfm.compand()
# apply a fade in and fade out
tfm.fade(fade_in_len=1.0, fade_out_len=0.5)
# create an output file.
tfm.build_file('path/to/input_audio.wav', 'path/to/output/audio.aiff')
# or equivalently using the legacy API
tfm.build('path/to/input_audio.wav', 'path/to/output/audio.aiff')
# get the output in-memory as a numpy array
# by default the sample rate will be the same as the input file
array_out = tfm.build_array(input_filepath='path/to/input_audio.wav')
# see the applied effects
tfm.effects_log
> ['trim', 'compand', 'fade']

Transform in-memory arrays:

import numpy as np
import sox
# sample rate in Hz
sample_rate = 44100
# generate a 1-second sine tone at 440 Hz
y = np.sin(2 * np.pi * 440.0 * np.arange(sample_rate * 1.0) / sample_rate)
# create a transformer
tfm = sox.Transformer()
# shift the pitch up by 2 semitones
tfm.pitch(2)
# transform an in-memory array and return an array
y_out = tfm.build_array(input_array=y, sample_rate_in=sample_rate)
# instead, save output to a file
tfm.build_file(
    input_array=y, sample_rate_in=sample_rate,
    output_filepath='path/to/output.wav'
)
# create an output file with a different sample rate
tfm.set_output_format(rate=8000)
tfm.build_file(
    input_array=y, sample_rate_in=sample_rate,
    output_filepath='path/to/output_8k.wav'
)

Concatenate 3 audio files:

import sox
# create combiner
cbn = sox.Combiner()
# pitch shift combined audio up 3 semitones
cbn.pitch(3.0)
# convert output to 8000 Hz stereo
cbn.convert(samplerate=8000, n_channels=2)
# create the output file
cbn.build(
    ['input1.wav', 'input2.wav', 'input3.wav'], 'output.wav', 'concatenate'
)
# the combiner does not currently support array input/output

Get file information:

import sox
# get the sample rate
sample_rate = sox.file_info.sample_rate('path/to/file.mp3')
# get the number of samples
n_samples = sox.file_info.num_samples('path/to/file.wav')
# determine if a file is silent
is_silent = sox.file_info.silent('path/to/file.aiff')
# file info doesn't currently support array input
Comments
  • Version1.2

    • [x] fix set_input_format in Combiner
    • [x] implement stretch
    • [x] implement speed
    • [x] implement sinc
    • [x] implement phaser
    • [x] implement mcompand
    • [x] implement echos
    • [x] implement echo
    • [x] implement bend
    opened by rabitt 18
  • Use Transformer in-memory with stdin/stdout

    This PR tries to address #6. I basically followed the code by @carlthome in pysndfx here: https://github.com/carlthome/python-audio-effects/blob/master/pysndfx/dsp.py#L472. I followed the discussion in the issue and implemented a new function that belongs to Transformer called build_array, which takes in a numpy array. The main issue is keeping track of all of the information that would normally be in the header of the audio file. Instead, these have to be passed as arguments to build_array, both for the input and output.

    I modified all of the tests. Each piece of the Transformer is tested by taking an input file and passing it through sox to write to an output file. So in the tests, I load both the input file and the output file as numpy arrays, pass the input array into tfm.build_array, and collect the output array (which is the second argument in the tuple, matching the API of tfm.build). Then I run np.allclose between the array loaded from the output file and the array returned by build_array. Every test was modified in this way except for one, which I couldn't get working in the time I had to put this together today. That test is test_bitdepth_valid, here: https://github.com/pseeth/pysox/blob/master/tests/test_transform.py#L1347.

    This change to the tests is encapsulated in a single function tfm_assert_array_to_file_output which you'll see calls to sprinkled throughout the test.

    To do all this, I modified the sox function in a backwards compatible way. I probably need to up coverage a bit still but this is hopefully okay to PR now for feedback.

    Let me know if this is a good start and how to get this merged, if possible! It would go a long way to making pysox super efficient for Scaper, which is why I'm here. :)

    Thanks!

    opened by pseeth 15
  • Use directly in memory, instead of via files, possible?

    Would it be possible to pass audio into pysox directly as ndarrays, as well as files from disk? I'm asking because I'm using librosa for onset detection, and thus already have the audio loaded, but would still like to apply some audio effects afterwards.

    Like maybe the constructors could do a type check, and build() could return the destination audio or something?

    y = librosa.load(path)[0]
    tfm = sox.Transformer(y)
    # Do stuff...
    y = tfm.build()
    

    I realize this adds extra complexity to pysox (like having librosa as a dependency or similar) which is intended to be a clean wrapper around SoX, so I get if it's deemed out of scope. Just asking!

    enhancement 
    opened by carlthome 14
  • [MRG] bugfix file_info.bitrate (b -> B in sox flag), define bitdepth, return None when file info is NA

    file_info.bitrate used to call soxi with flag -b rather than -B. Yes, -b is not the bit rate but the bit depth. This PR changes -b into -B in file_info.bitrate. It also defines a new function, file_info.bitdepth, which does what bitrate did earlier. I updated the tests accordingly. Closes #68, Closes #78, closes #84.

    bug 
    opened by lostanlen 13
  • Handle pathlib.Path path types

    Since Python 3.4 there has been a new object-oriented interface for handling paths. This PR adds support for pathlib.Path for functions which take a path as input. I make heavy use of pathlib.Path in my code these days, and while I can do the cast to string myself, I find that more and more libraries are supporting pathlib, and so I offer this PR.

    In practice, since pathlib.Path is implicitly cast to string, there are only a couple of minor places where this is important - any function which performs a string-based operation on a variable which could be a path.

    In order to make sure the tests pass, because this is a Python 3-only feature, I also removed 2.7 from your .travis.yml file. Since Python 2 is EOL, and your readme doesn't include a Python 2 badge at the top, it seemed like this should be okay. If not, let me know and I'll see if I can come up with some test flags for Python 2 which would skip these tests or something.
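
The kind of normalization described above can be sketched as follows. This is an illustration and not the code from the PR: the helper names are hypothetical, and it relies only on the standard-library `os.fspath`, which accepts both strings and `pathlib.Path` objects.

```python
import os
from pathlib import Path


def as_path_string(path):
    """Return `path` as a plain string, accepting str or pathlib.Path."""
    return os.fspath(path)


def file_extension(path):
    """Return the lowercase extension without the dot, e.g. 'wav'.

    This is an example of a string-based operation that would fail on a
    Path object unless the value is converted first.
    """
    return os.path.splitext(as_path_string(path))[1].lstrip(".").lower()
```

With a conversion like this at the top of each path-handling function, callers can pass either `'clip.wav'` or `Path('clip.wav')`.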

    opened by cjacoby 11
  • Support using same file path for input & output

    Sox doesn't support using the same file as both input and output - doing this will result in an empty, invalid audio file. While this is sox behavior and not pysox, it would be nice if pysox took care of this behind the scenes. Right now the user needs to worry about this logic themselves, e.g. like this:

    import tempfile
    import shutil
    import sox
    from scaper.util import _close_temp_files
    
    audio_infile = '/Users/justin/Downloads/trimtest.wav'
    audio_outfile = '/Users/justin/Downloads/trimtest.wav'
    start_time = 2
    end_time = 3
    
    tfm = sox.Transformer()
    tfm.trim(start_time, end_time)
    if audio_outfile != audio_infile:
        tfm.build(audio_infile, audio_outfile)
    else:
        # must use temp file in order to save to same file
        tmpfiles = []
        with _close_temp_files(tmpfiles):
            # Create tmp file
            tmpfiles.append(
                tempfile.NamedTemporaryFile(
                    suffix='.wav', delete=True))
            # Save trimmed result to temp file
            tfm.build(audio_infile, tmpfiles[-1].name)
            # Copy result back to original file
            shutil.copyfile(tmpfiles[-1].name, audio_outfile)
    

    Pysox does issue a warning when a file is about to be overwritten, which is even more confusing under this scenario since the user (who might be unfamiliar with the quirks of sox) has no reason to think that the overwritten file will be invalid.
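
One way a wrapper could hide this detail is sketched below. It is an assumption-laden illustration, not pysox behavior: `build_in_place_safe` is a hypothetical helper, and `build_fn` stands for any callable with the shape of `tfm.build(input_path, output_path)`.

```python
import os
import shutil
import tempfile


def build_in_place_safe(build_fn, input_path, output_path):
    """Call build_fn(input_path, output_path), going through a temporary
    file when both paths point at the same location."""
    same_file = os.path.abspath(input_path) == os.path.abspath(output_path)
    if not same_file:
        return build_fn(input_path, output_path)

    # Keep the original extension so the output format can be inferred.
    suffix = os.path.splitext(output_path)[1]
    fd, tmp_path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)
    try:
        # Write the result to the temporary file, then copy it back.
        status = build_fn(input_path, tmp_path)
        shutil.copyfile(tmp_path, output_path)
        return status
    finally:
        os.remove(tmp_path)
```

Usage would look like `build_in_place_safe(tfm.build, audio_infile, audio_outfile)`. Because the temporary file already exists when the build runs, an overwrite warning would still be expected for it.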

    opened by justinsalamon 11
  • Remove grep dependency, parse help string in python

    Calling grep breaks pysox (and subsequently scaper) on Windows. This PR addresses this by removing the use of grep altogether and parsing the sox help string directly in Python.

    Addresses issue #66 in pysox, issue justinsalamon/scaper#25 in scaper
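
Parsing the help text without grep can be sketched roughly like this. The function name and the sample text are hypothetical; the sample only imitates the labeled lines that `sox -h` prints and is not real output.

```python
# Illustrative stand-in for the text printed by `sox -h`.
SAMPLE_HELP = (
    "SoX v14.4.2\n"
    "\n"
    "AUDIO FILE FORMATS: aiff flac mp3 wav\n"
    "PLAYLIST FORMATS: m3u pls\n"
    "EFFECTS: allpass band trim\n"
)


def parse_help_list(help_text, label):
    """Return the whitespace-separated items on the line starting with
    '<label>:', or an empty list if no such line exists."""
    prefix = label + ":"
    for line in help_text.splitlines():
        if line.startswith(prefix):
            return line[len(prefix):].split()
    return []
```

For example, `parse_help_list(SAMPLE_HELP, "AUDIO FILE FORMATS")` yields the list of format names, with no shell utilities involved.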

    opened by justinsalamon 10
  • Implement Transformer:noiseprof and noisered

    Implements #37. I also found that we need a method to clear effects in Transformer so that we do not need to instantiate a new one.

    Also, the noiseprof effect is unusual: the name of the output profile file does not come after the input file name but after noiseprof. In a shell, the whole command would be something like sox noise.mp3 -n noiseprof noise.prof, where -n is a null file. I don't really understand why sox works that way, and the -n seems to be useless. Because -n sits in the position of the output file name, the real output file is set in the noiseprof() function.

    Example:

    >>> import sox
    >>> tfs = sox.Transformer()
    >>> tfs.noiseprof()
    <sox.transform.Transformer object at 0x7f140c221250>
    >>> tfs.build('/home/lancaster/pb-4.wav', '-n')
    WARNING:root:This install of SoX cannot process . files.
    True
    >>> tfs.effects
    ['noiseprof', 'noise.prof']
    >>> tfs.clear_effects()
    >>> tfs.effects
    []
    >>> tfs.noisered(amount=0.3)
    WARNING:root:This install of SoX cannot process .prof files.
    <sox.transform.Transformer object at 0x7f140c221250>
    >>> tfs.build('/home/lancaster/pb-4.wav', '/home/lancaster/clear.wav')
    WARNING:root:output_file: /home/lancaster/clear.wav already exists and will be overwritten on build
    True
    

    It worked well in my tests. I am new to PRs and not familiar with coveralls, so I just tested it in the console... I haven't found out why these warnings appear. They may need to be fixed later...

    opened by Page-David 9
  • Updated build API for 1.4.0

    Implements option B from issue #106

    • build_file takes array or file inputs and generates file outputs, returning a boolean
    • build_array takes array or file inputs and returns np.ndarray outputs

    build_file is an alias for build, and I moved the documentation to point people towards using build_file, while keeping build fully functional.

    I also did a little cleanup of how set_input_format and set_output_format work - rather than storing the list of format arguments to be passed to sox, it stores a dictionary which is parsed when running build/build_array, which cleans up the build functions substantially.
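
The idea of storing a dictionary and translating it at build time might look roughly like the sketch below. The function and key names are hypothetical and not taken from pysox; the flags `-t`, `-r`, `-b`, `-c`, and `-e` are the standard sox format options.

```python
def format_args(fmt):
    """Translate a format dictionary into a list of sox command-line flags.

    Keys that are missing or set to None are skipped.
    """
    flag_for_key = [
        ("file_type", "-t"),  # file type, e.g. 'wav'
        ("rate", "-r"),       # sample rate in Hz
        ("bits", "-b"),       # bits per sample
        ("channels", "-c"),   # number of channels
        ("encoding", "-e"),   # e.g. 'signed-integer'
    ]
    args = []
    for key, flag in flag_for_key:
        value = fmt.get(key)
        if value is not None:
            args.extend([flag, str(value)])
    return args
```

Keeping the dictionary around, rather than a pre-built argument list, means the same stored settings can be interpreted differently for file and array output.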

    opened by rabitt 6
  • v1.4.0 build api

    So far, in 1.4.0a0, we've overloaded the build API to support 4 cases:

    1. file in, file out: status = tfm.build(input_filepath=..., output_filepath=...)
    2. file in, array out: status, out, err = tfm.build(input_filepath=...)
    3. array in, file out: status = tfm.build(input_array=..., sample_rate_in=..., output_filepath=...)
    4. array in, array out: status, out, err = tfm.build(input_array=..., sample_rate_in=...)

    The pros:

    • it's backwards compatible
    • it's all in one function

    The cons:

    • the returns change in a confusing way as a function of the combination of inputs
    • only certain combinations of inputs are valid

    Do we want to keep it this way? Some other alternatives:

    A. Write 4 separate functions. We keep build as file in-file out for backwards compatibility and define 3 new functions for the other cases.
    B. Write 2 separate functions, one for file output and one for array output, keeping the input argument logic (e.g. build and build_array, similar to @pseeth's original PR) in order to support either file or array input for both functions.
    C. Keep build as it is currently, and write 4 additional wrapper functions for each of the 4 cases that internally call build.

    ... other ideas?

    Would love to hear people's thoughts. cc @pseeth @nicolamontecchio @psobot @carlthome @hadware @lostanlen
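
To make the "only certain combinations of inputs are valid" point concrete, the input check that any of these designs needs could be sketched as below. This is a hypothetical helper, not pysox's actual validation; it only reuses the argument names from the four cases listed above.

```python
def check_build_inputs(input_filepath=None, input_array=None,
                       sample_rate_in=None):
    """Return 'file' or 'array' for a valid input combination.

    Raises ValueError when neither or both inputs are given, or when an
    array is passed without its sample rate.
    """
    has_file = input_filepath is not None
    has_array = input_array is not None
    if has_file == has_array:
        raise ValueError(
            "Provide exactly one of input_filepath or input_array."
        )
    if has_array and sample_rate_in is None:
        raise ValueError(
            "sample_rate_in is required when input_array is given."
        )
    return "file" if has_file else "array"
```

Under option B, both the file-output and the array-output function could share a check like this, so only the return type differs between them.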

    question 
    opened by rabitt 6
  • Switch to using a named logger so callers can configure log levels.

    Closes #51. This is kind of hacky, but should work and should allow callers to configure pysox logging via logging.getLogger('sox') on the global logging object.
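
From the caller's side, configuring the named logger could look like this minimal sketch. It uses only the standard `logging` module and the logger name `'sox'` mentioned above; the chosen level is just an example.

```python
import logging

# Look up the logger by the name mentioned in this PR.
sox_logger = logging.getLogger("sox")

# Silence warnings (such as overwrite notices) and keep errors.
sox_logger.setLevel(logging.ERROR)
```

Because loggers are looked up by name, this can be done in application code before or after importing the library.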

    opened by psobot 6
Releases(v0.4.0)
Owner
Rachel Bittner