Python library for audio and music analysis

Overview

librosa

A Python package for music and audio analysis.

Documentation

See https://librosa.org/doc/ for a complete reference manual and introductory tutorials.

Installation

The latest stable release is available on PyPI, and you can install it by saying

pip install librosa

Anaconda users can install using conda-forge:

conda install -c conda-forge librosa

To build librosa from source, say python setup.py build. Then, to install librosa, say python setup.py install. If all went well, you should be able to execute the demo scripts under examples/ (OS X users should follow the installation guide given below).

Alternatively, you can download or clone the repository and use pip to handle dependencies:

unzip librosa.zip
pip install -e librosa

or

git clone https://github.com/librosa/librosa.git
pip install -e librosa

After running pip list, you should see librosa listed as an installed package:

librosa (0.x.x, /path/to/librosa)

Hints for the Installation

librosa uses soundfile and audioread to load audio files. Note that soundfile does not currently support MP3, which will cause librosa to fall back on the audioread library.

soundfile

If you're using conda to install librosa, then most audio coding dependencies (except MP3) will be handled automatically.

If you're using pip on a Linux environment, you may need to install libsndfile manually. Please refer to the SoundFile installation documentation for details.

audioread and MP3 support

To fuel audioread with more audio-decoding power (e.g., for reading MP3 files), you may need to install either ffmpeg or GStreamer.

Note that on some platforms, audioread needs at least one of these programs to work properly.

If you are using Anaconda, install ffmpeg by calling

conda install -c conda-forge ffmpeg

If you are not using Anaconda, here are some common commands for different operating systems:

  • Linux (apt-get): apt-get install ffmpeg or apt-get install gstreamer1.0-plugins-base gstreamer1.0-plugins-ugly
  • Linux (yum): yum install ffmpeg or yum install gstreamer1.0-plugins-base gstreamer1.0-plugins-ugly
  • Mac: brew install ffmpeg or brew install gstreamer
  • Windows: download binaries from this website

For GStreamer, you also need to install the Python bindings with

pip install pygobject
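
Before debugging a failed MP3 load, it can help to confirm that a decoding backend is visible from Python. The following is a minimal sketch using only the standard library. The helper name available_decoders is hypothetical (it is not part of librosa or audioread), and it assumes that ffmpeg is discoverable as an executable on PATH and that the GStreamer Python bindings are importable as the gi module, which is what pygobject provides.

```python
import importlib.util
import shutil


def available_decoders():
    """Report which optional decoding backends look installed.

    Hypothetical helper for illustration; not part of librosa or audioread.
    """
    return {
        # ffmpeg is run as an external program, so look for it on PATH.
        "ffmpeg": shutil.which("ffmpeg") is not None,
        # pygobject installs the `gi` package used to reach GStreamer.
        "gstreamer_bindings": importlib.util.find_spec("gi") is not None,
    }


if __name__ == "__main__":
    for name, found in available_decoders().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

A "found" result for the bindings only shows that the gi package can be imported; it does not prove that GStreamer itself or its plugins are installed.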

Discussion

Please direct non-development questions and discussion topics to our web forum at https://groups.google.com/forum/#!forum/librosa

Citing

If you want to cite librosa in a scholarly work, there are two ways to do it.

  • If you are using the library for your work, for the sake of reproducibility, please cite the version you used as indexed at Zenodo:

    DOI

  • If you wish to cite librosa for its design, motivation etc., please cite the paper published at SciPy 2015:

    McFee, Brian, Colin Raffel, Dawen Liang, Daniel PW Ellis, Matt McVicar, Eric Battenberg, and Oriol Nieto. "librosa: Audio and music signal analysis in python." In Proceedings of the 14th python in science conference, pp. 18-25. 2015.

Comments
  • Multirate Filterbank from Chroma Toolbox

    Hi everyone,

    following #394, this is a first start on integrating the multirate filterbank which is used in the chroma toolbox.

    I have another notebook where I show the whole processing chain: https://github.com/stefan-balke/mpa-exc/blob/master/02_fourier_transform/pitch_filterbank.ipynb

    As for implementation, it could live as another "spectral representation" in spectrum.py. It's really more of a CQT, but I guess we leave cqt.py reserved for the actual transform. Parameter-wise, this one is pretty much fixed, although one could add a parameter for "detuning", meaning a reference frequency other than 440 Hz for A4.

    But that's up for discussion.

    As next steps, I would add some unit tests comparing the filter coefficients to the Chroma Toolbox and a function which actually calls this filterbank. Should be straightforward!


    enhancement functionality 
    opened by stefan-balke 89
  • DTW

    Hey all,

    I mainly took @craffel's dijtw source code and merged it with mine. As discussed in #298, we want the following features:

    • [x] Arbitrary step sizes
    • [x] Additive or multiplicative local weights for the steps
    • [x] Subsequence (so it can be used for matching)
    • [x] Global path constraints (e.g. Sakoe-Chiba band etc.)
    • [x] make numba optional (cf. Brian's comment)
    • [x] test backtracking explicitly
    • [x] plot D + wp (in the examples)
    • ~~Gullying~~

    After finishing the features (which are in dijtw), we need more tests and some final notebook with benchmarking the implementation against "vanilla-vanilla" dtw.
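
To make the feature list above concrete, here is a minimal pure-Python sketch of an accumulated-cost computation with configurable step sizes, additive step weights, and an optional subsequence mode. It is an illustration under stated assumptions, not the code from this PR: the function name dtw_cost, its parameters, and the absolute-difference local cost are all hypothetical choices.

```python
import math


def dtw_cost(x, y, steps=((1, 1), (0, 1), (1, 0)),
             weights_add=(0.0, 0.0, 0.0), subseq=False):
    """Accumulated-cost matrix D for two 1-D sequences (illustrative sketch).

    steps       : allowed moves (di, dj) that lead into cell (i, j)
    weights_add : additive penalty attached to each step
    subseq      : if True, x may start anywhere in y (free first row)
    """
    n, m = len(x), len(y)
    D = [[math.inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            local = abs(x[i] - y[j])  # assumed local cost
            if i == 0 and (j == 0 or subseq):
                # Path start: the origin, or any column in subsequence mode.
                D[i][j] = local
                continue
            best = math.inf
            for (di, dj), w in zip(steps, weights_add):
                pi, pj = i - di, j - dj
                if pi >= 0 and pj >= 0:
                    best = min(best, D[pi][pj] + w)
            D[i][j] = best + local
    return D


if __name__ == "__main__":
    query = [1, 2, 3]
    target = [0, 0, 1, 2, 3, 0, 0]
    D = dtw_cost(query, target, subseq=True)
    # In subsequence mode the match may end at any column of the last row.
    print(min(D[-1]))
```

For full alignment the total cost is read from the last cell, D[-1][-1]; for subsequence matching it is the minimum over the last row. Global path constraints such as a Sakoe-Chiba band and the backtracking step are left out to keep the sketch short.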


    functionality 
    opened by stefan-balke 84
  • The big Multi-Channel PR

    Reference Issue

    This PR begins the work of #1130, extending librosa to support multi-channel wherever possible.

    What does this implement/fix? Explain your changes.

    This branch will serve as the target for smaller pull requests implementing multi-channel support throughout the package.

    As a first step, I've implemented multi-channel stft. It's numerically equivalent to our previous implementation, and comparably efficient for single-channel inputs. For multi-channel inputs, it's not as efficient as I think is possible, but optimization will take a bit of sleuthing.
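
The idea of a transform that accepts any number of leading channel dimensions can be sketched with NumPy alone. This is a simplified illustration, not the implementation in this PR: the function name stft_multichannel and its default parameters are hypothetical, and the sketch skips centering and padding and uses a symmetric Hann window.

```python
import numpy as np


def stft_multichannel(y, n_fft=8, hop_length=4):
    """Short-time Fourier transform over the last axis of `y`.

    Illustrative sketch: input shape (..., n_samples) maps to
    output shape (..., 1 + n_fft // 2, n_frames).
    """
    y = np.asarray(y, dtype=float)
    n_frames = 1 + (y.shape[-1] - n_fft) // hop_length
    window = np.hanning(n_fft)
    # Slice each frame along the last axis; leading axes are untouched.
    frames = np.stack(
        [y[..., t * hop_length: t * hop_length + n_fft] * window
         for t in range(n_frames)],
        axis=-1,
    )  # shape (..., n_fft, n_frames)
    # Transform along the frame-length axis only.
    return np.fft.rfft(frames, axis=-2)


if __name__ == "__main__":
    stereo = np.random.default_rng(0).standard_normal((2, 32))
    print(stft_multichannel(stereo).shape)
    print(stft_multichannel(stereo[0]).shape)
```

Because every operation acts on the trailing axes, a mono signal and one channel of a stereo signal give the same result, which is the sense in which the multi-channel version stays numerically equivalent to the single-channel one.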

    Any other comments?

    This PR will not be merged until #1130 is complete.


    Progress tracker

    beat

    • [x] tempo : yes, one tempo per channel?
    • [x] plp : yes

    decompose

    • [x] decompose : maybe, could require some clever reshaping. not obvious how to do it.
    • [x] hpss : yes
    • [x] nn_filter : features: probably, self-similarity: ~probably not~ no

    effects

    • [x] hpss : yes
    • [x] harmonic : yes
    • [x] percussive : yes
    • [x] time_stretch : yes
    • [x] pitch_shift : yes (issue #1085)
    • [x] preemphasis: yes
    • [x] deemphasis: yes
    • [x] trim
    • [x] split
    • [x] remix : yes. Update: zc alignment won't be feasible here; we can only support zc alignment for mono signals.

    filters

    • ~[ ] diagonal_filter: could require some thought~ - not necessary; this is mainly used for building smoothing filters for self-similarity matrices, which in the multi-channel case, would still operate independently across channels. Keeping diagonal filters 2d seems appropriate.

    onset

    • [x] onset_strength: yes
    • [x] onset_strength_multi: yes
    • ~[ ] onset_backtrack: probably not~ no

    segment

    • [x] cross_similarity : and
    • [x] recurrence_matrix : yes-ish: channels dimensions are flattened so output integrates over channels
    • ~[ ] recurrence_to_lag~ : no
    • ~[ ] lag_to_recurrence~ : no
    • ~[ ] timelag_filter : maybe~ nothing to do here, as it depends entirely on lag<->recurrence conversion
    • [x] path_enhance: yes

    sequence

    • [x] dtw : probably best to treat similarly to segment modules: flatten channel dimensions
    • ~[ ] rqa : not really; ragged output. otherwise, treat like dtw~ rqa takes a square matrix as input; nothing to do here
    • [x] viterbi : yes
    • [x] viterbi_discriminative : yes
    • [x] viterbi_binary : yes

    core

    • [x] stream: n/a, I think?
    • [x] to_mono: yes
    • [x] resample: yes, n/a
    • [x] get_duration: yes
    • [x] stft: yes
    • [x] istft: yes
    • [x] magphase : yes
    • [x] griffinlim: yes
    • [x] zero_crossings: yes
    • [x] autocorrelate: yes
    • [x] fmt: yes
    • [x] pcen: yes
    • [x] cqt: yes
    • [x] vqt: yes
    • [x] pseudo_cqt: yes
    • [x] hybrid_cqt: yes
    • [x] icqt: yes
    • [x] griffinlim_cqt: yes
    • [x] reassigned_spectrogram: yes
    • [x] phase_vocoder: yes
    • [x] interp_harmonics: yes
    • ~[ ] harmonics_1d: yes~ -> remove, fold functionality into interp_harmonics
    • ~[ ] harmonics_2d: yes~ -> remove
    • [x] salience: yes
    • [x] iirt: yes
    • [x] lpc: yes
    • [x] yin: yes
    • [x] piptrack: yes
    • [x] estimate_tuning: yes ~should produce one tuning estimate per channel~ one tuning estimate across all channels
    • ~[ ] pitch_tuning: yes~ no, this works on a set of frequencies, not the signal directly. Nothing to do here.
    • [x] pyin: yes
    • [x] clicks: yes, but only if a multichannel click sample is provided
    • ~[ ] tone~: no

    feature

    • [x] chroma_stft: yes
    • [x] chroma_cqt: yes
    • [x] chroma_cens: yes
    • [x] melspectrogram: yes
    • [x] mfcc: yes
    • [x] rms: yes
    • [x] spectral_centroid: yes
    • [x] spectral_bandwidth: yes
    • [x] spectral_flatness: yes
    • [x] spectral_rolloff: yes
    • [x] poly_features : yes
    • [x] tonnetz : yes
    • [x] zero_crossing_rate : yes
    • [x] delta : yes
    • [x] stack_memory : yes
    • [x] tempogram : yes
    • [x] fourier_tempogram : yes
    • [x] mel_to_stft: yes
    • [x] mel_to_audio : yes
    • [x] mfcc_to_mel: yes
    • [x] mfcc_to_audio: yes

    util

    • [x] frame: yes
    • [x] pad_center: yes
    • [x] fix_length: yes
    • [x] softmask: yes
    • [x] normalize: yes
    • [x] localmax: yes
    • [x] localmin: yes
    • [x] valid_audio: yes
    • [x] stack: yes
    • ~[ ] fix_frames: probably not / nothing to do?~
    • ~[ ] index_to_slice~: no - this is an inherently one-dimensional operation
    • [x] sync: yes, maybe already works
    • ~[ ] axis_sort: ~maybe, not clear what it would mean~ probably not; possible rename to axis_sort2d~ no
    • ~[ ] shear: maybe; dense only? but probably not~ no
    • [x] nnls: yes

    Maintenance tasks

    • [x] audit docstrings of all functions extended for multichannel
    • [x] expand_to helper function
    • [x] relax valid_audio shape checks
    • [x] audit for use of valid_audio
    enhancement functionality API change 
    opened by bmcfee 65
  • YIN and pYIN

    Reference Issue

    Fixes #527

    What does this implement/fix? Explain your changes.

    This pull request implements the YIN and pYIN algorithms for pitch tracking. The YIN function is based on @lostanlen's PR #974 (with a few modifications) while the pYIN function is based on this paper.

    Any other comments?

    Both functions work well but some refactoring is definitely needed. I compared the outputs of pYIN to the official vamp plugin and the results are comparable. I haven't added any special treatment of low amplitude frames yet, so silent frames with periodic noise occasionally give wrong results.

    Also, note that I haven't used the librosa.core.autocorrelate function. The librosa.core.autocorrelate computes equation (2) in the YIN paper which seems to perform worse than the one used here (which computes equation (1)). I tried scaling by the window size (as suggested in the paper) but it didn't improve things much.
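
The difference between the two autocorrelation variants can be sketched in a few lines. This assumes that equation (1) in the YIN paper is the form that always sums over a full window, and that equation (2) is the form whose summation range shrinks as the lag grows. The function names are hypothetical, indexing is zero-based, and this is not the code from this PR.

```python
def acf_fixed_window(x, t, lag, window):
    """Equation (1)-style sum: always `window` products, for every lag."""
    return sum(x[j] * x[j + lag] for j in range(t, t + window))


def acf_shrinking_window(x, t, lag, window):
    """Equation (2)-style sum: only `window - lag` products."""
    return sum(x[j] * x[j + lag] for j in range(t, t + window - lag))


if __name__ == "__main__":
    signal = [1.0] * 32  # constant signal makes the taper easy to see
    for lag in range(4):
        print(lag,
              acf_fixed_window(signal, 0, lag, 8),
              acf_shrinking_window(signal, 0, lag, 8))
```

On a constant signal the fixed-window sum stays flat across lags, while the shrinking sum decays linearly with the lag. That built-in taper biases the second variant toward short lags, which is one plausible reason it behaved worse in the comparison described above.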

    I haven't implemented any tests yet so it would be good to get some guidance there.

    functionality 
    opened by bshall 59
  • DTW

    Hey there,

    I recently did a DTW implementation based on Meinard's book with some Cython speed-up for the dynamic programming. As this is not yet reflected in librosa, I wondered if we could do a module for music synchronization.

    Best, Stefan

    functionality discussion 
    opened by stefan-balke 45
  • CQT length scale normalization

    This PR implements #412, and a couple of bug-fixes. Summary of contents:

    • Added a scale boolean option to all CQT methods. If enabled, CQT bands are normalized by sqrt(n[i]) where n[i] is the length of the ith filter. This is analogous to norm='ortho' mode in np.fft.fft.
    • Early downsampling is less aggressive by one octave. Previously, it was too aggressive, and the top-end of each octave was close to nyquist, which resulted in attenuation.
    • Magnitude is now continuous and approximately equal for full, hybrid, and pseudo CQT. This fixes some of the unresolved continuity errors noted in #347.
    • Expanded the CQT unit tests to include a white noise continuity test.

    Using scale=True means that white noise input will look like white noise output in CQT space (flat spectrum). With scale=False (current default), white noise in looks like 1/sqrt(f).

    scale=True makes an impulse look like sqrt(f) as opposed to constant for scale=False.
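
The analogy with norm='ortho' can be checked directly on the FFT. The sketch below uses only NumPy and does not touch the CQT code in this PR. It shows that the orthonormal scaling (dividing by the square root of the transform length) preserves signal energy, whereas the unscaled transform inflates it by the length.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1024)  # white noise test signal

X_plain = np.fft.fft(x)                # unscaled forward transform
X_ortho = np.fft.fft(x, norm="ortho")  # scaled by 1 / sqrt(len(x))

energy_time = np.sum(np.abs(x) ** 2)
energy_plain = np.sum(np.abs(X_plain) ** 2)
energy_ortho = np.sum(np.abs(X_ortho) ** 2)

if __name__ == "__main__":
    print(energy_plain / energy_time)  # close to len(x)
    print(energy_ortho / energy_time)  # close to 1
```

In the CQT case each band has its own filter length n[i], so the scaling factor differs per band instead of being a single constant.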


    bug enhancement functionality API change 
    opened by bmcfee 41
  • [Discussion] CQT, VQT, energy preservation, and frequency band alignment

    I wanted to make a separate issue to consolidate the discussion around a number of related points that have popped up in the implementation of VQT #1018, inverse CQT #165, and so on. @lostanlen and I have discussed these things in various places (often offline), but it would be helpful to have a more permanent record.

    Frequency band definitions

    In the CQT implementation, we follow the SK2010 definition of the frequency bands, which are essentially left-aligned: [f, a * f] (for some constant a). This works well enough, but it makes some calculations awkward, e.g. in the VQT. Later on in the CQT toolbox, the definition shifted to a centered representation, e.g. [f / sqrt(a), f * sqrt(a)] (again for some constant a), so that the frequency of each filter is centered within the band.

    Some questions:

    • Should we add support for centered frequency bands? I don't think it would change much, except possibly some boundary effects at the top end of octaves.
    • Should we make centering the default? Should we even bother with left-aligned bands anymore?
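
The two band definitions discussed above can be compared numerically. This is a small sketch using only the standard library; the function names are hypothetical, and the constant a is assumed to be one semitone (2 ** (1 / 12)) purely for illustration.

```python
import math


def left_aligned_band(f, a):
    """Band [f, a * f]: the filter frequency sits at the lower edge."""
    return (f, a * f)


def centered_band(f, a):
    """Band [f / sqrt(a), f * sqrt(a)]: f is the geometric center."""
    root = math.sqrt(a)
    return (f / root, f * root)


if __name__ == "__main__":
    a = 2 ** (1 / 12)  # assumed band ratio: one semitone
    print(left_aligned_band(440.0, a))
    print(centered_band(440.0, a))
```

Both definitions span the same ratio a between the upper and lower edge. They differ only in where f falls inside the band, so switching between them shifts every band by a constant factor of sqrt(a).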

    Energy preservation

    Our CQT implementation has historically had a bunch of headaches around normalization and energy preservation. Of course we can't exactly preserve energy with a lossy frequency representation, but our current cqt/icqt round trip gets pretty close. However, other feature representations (notably Mel) are not so forgiving, and it would be nice if we could strive for consistency here.

    discussion 
    opened by bmcfee 37
  • Default sample rate for librosa.core.load is not the native sample rate

    Currently, the default sample rate is 22050 Hz when loading an audio file. In order to use the native sample rate, sr must be set to None. I often forget to set this to None when I want the native sample rate, and I have noticed many students do the same.

    Have others experienced confusion over this? Should this change in a later release so that the native sample rate is the default?

    discussion IO 
    opened by mcartwright 31
  • Add type annotations

    I would like to have code completions and type checking for code that uses librosa. Adding type annotations will significantly improve the coding experience for users who use type checkers / typing-based completion engines.

    Currently some type checkers like Pylance are able to infer some types, however this is often wrong and leads to false positives (errors on working code) or false negatives (missing errors on invalid code). Other type checkers like Mypy cannot do anything with code that uses librosa, and must treat it as 'Any'.

    functionality 
    opened by matangover 30
  • RFC: more example tracks?

    It's been mentioned on numerous occasions that our current example track, catchy as it may be, is not great for demonstrating many of the functions of librosa. In offline discussions, @lostanlen and I have talked about extending the included audio content to have several tracks, which could be used to demonstrate different functionality. So I want to kick this out to the community: what do you want to see in our examples? By this I mean: please recommend specific recordings.

    To prevent this discussion from becoming an infinite bike shed, I'm going to lay down some ground rules for inclusion:

    1. Content must be CC-licensed or public domain.
    2. Total content should not bloat the package too much; I think 10MB is a reasonable upper bound, and this functionally limits us to between 5 and 10 total recordings.
    3. No single track should be too long. It's okay to have some very short examples.
    4. The total collection should be diverse in terms of style, instrumentation, polyphony, etc. If possible, I'd like to include some non-western recordings as well.
    5. Any lyrical content should not be offensive, for some reasonable definition of offensive which is compatible with our CoC. I don't expect any problems here, but I'll reserve executive privilege here to veto anything that could be problematic.
    6. Familiarity would be a bonus, for making the examples and documentation more immediately accessible.

    With all of that out of the way, let's talk about things not currently demonstrated by our current example. We don't have to hit all of these, and I'm sure I'm missing some, but we should aim to hit most of them.

    • Monophonic audio: we should have at least one solo instrument recording that can be used to demonstrate things like pitch tracking. Maybe a raga or makam could be good here? This could also be good for demonstrating onsets.
    • Interesting harmony: the current example is pretty boring, harmonically speaking. Maybe a jazz recording would be appropriate here? Maybe something with some key changes as well.
    • Non-percussive rhythmic elements: something classical (strings) would be nice to have for demonstrating onset and beat tracking with soft attacks.
    • Different time signatures: examples with 4/4, 3/4, and maybe a 5/4 or 7/8 would be nice for demonstrating some of the rhythmic features (tempogram, fmt)
    • Vocals and instrumentals: we should have at least one track with vocals.
    • Non-musical audio: do we need/want this? Speech? Environmental sound? Librosa gets used for these things, so it might be worth considering their inclusion.

    Some other discussions that we can have around this:

    • Should we have a multi-track / stem set as one of the examples?
    • What kind of genre coverage should we strive for?
    • How does this issue interact with #641 (non-western systems in display)? Is #641 a pre-requisite for including non-western examples? (I'd argue that it should be, and that this would be a good motivating factor for finally doing it.)

    Finally, the examples gallery already includes a few candidate options here. We can take some, all, or none of these, but whatever we decide on including should serve as plausible replacements for them and the example notebooks should be revised afterward.

    discussion management 
    opened by bmcfee 30
  • Added optional pyFFTW backend

    As proposed in https://github.com/librosa/librosa/issues/353 scipy.fftpack is a terrible bottleneck and supporting FFTW would be immensely useful. This pull request adds an optional wrapper for FFTW. If pyFFTW is installed librosa will prefer that over scipy.fftpack. Everything will work as normal if the user is missing pyFFTW.

    @bmcfee, thoughts? The speedup is substantial, and as librosa depends on STFT all over the place (even for rmse) I feel this is a necessary addition, and a lot nicer than having users monkey patch outside of librosa.
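
The "prefer pyFFTW if present, otherwise carry on as normal" behavior described above is a standard optional-import pattern. The sketch below is not the wrapper from this PR: it falls back to numpy.fft rather than scipy.fftpack to keep the example minimal, and the names fft_backend, BACKEND_NAME and spectrum are hypothetical.

```python
import numpy as np

try:
    # Optional accelerated backend; only used when pyFFTW is installed.
    import pyfftw.interfaces.numpy_fft as fft_backend
    BACKEND_NAME = "pyfftw"
except ImportError:
    # Fallback that is always available alongside NumPy.
    import numpy.fft as fft_backend
    BACKEND_NAME = "numpy"


def spectrum(x):
    """Forward FFT through whichever backend was importable."""
    return fft_backend.fft(np.asarray(x, dtype=complex))


if __name__ == "__main__":
    print(BACKEND_NAME)
    print(spectrum([1, 0, 0, 0]))
```

Callers only ever touch spectrum, so the choice of backend stays an internal detail and nothing has to be monkey patched from outside the package.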


    enhancement 
    opened by carlthome 30
  • Drop support for threshold=None in zero_crossings

    Is your feature request related to a problem? Please describe.

    This came up in implementing #1632 - the threshold parameter in zero_crossings is currently optional, but it doesn't need to be. If threshold is None, it is converted to 0 automatically.

    Describe the solution you'd like

    Threshold=0 is equivalent behavior, and dropping the None support would simplify some type annotations and a bit of the implementation.
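
The equivalence claimed above can be illustrated with a simplified, pure-Python zero-crossing detector. This is a sketch, not librosa's implementation: it treats samples within the threshold as zero, counts zero as positive, and never marks the first sample as a crossing. Those conventions are assumptions made for the example.

```python
def zero_crossings(y, threshold=0.0):
    """Mark positions where the sign of `y` changes (illustrative sketch)."""
    if threshold is None:
        # Legacy behavior described above: None is converted to 0.
        threshold = 0.0
    # Samples with abs(value) <= threshold are clipped to exactly zero.
    negative = [(0.0 if abs(v) <= threshold else v) < 0 for v in y]
    if not negative:
        return []
    # The first sample has no predecessor, so it is never a crossing here.
    return [False] + [a != b for a, b in zip(negative, negative[1:])]


if __name__ == "__main__":
    samples = [0.5, -0.5, 0.01, -0.01, 1.0]
    print(zero_crossings(samples, threshold=None) == zero_crossings(samples))
    print(zero_crossings(samples, threshold=0.05))
```

With the None branch removed, the parameter could be annotated as a plain float, which is the simplification the request is after.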

    API change 
    opened by bmcfee 1
  • Getting an Error: ValueError: Input signal length=0 is too small to resample from 44100->22050 ,  while loading an mp3 file by librosa.load()

    Describe the bug

    I have recently been working with some audio data, and I get this ValueError when I try to load an audio file with librosa.load().

    To Reproduce

    Expected behavior

    My guess is that this error happens because x, which should be a 1-D NumPy array, is empty in this case. But why? When I play the sound with IPython, it works fine.

    Software versions

    Windows-10-10.0.22621-SP0
    Python 3.9.6 (tags/v3.9.6:db3ff76, Jun 28 2021, 15:26:21) [MSC v.1929 64 bit (AMD64)]
    NumPy 1.22.3
    SciPy 1.8.0
    librosa 0.9.2

    INSTALLED VERSIONS

    python: 3.9.6 (tags/v3.9.6:db3ff76, Jun 28 2021, 15:26:21) [MSC v.1929 64 bit (AMD64)]
    librosa: 0.9.2
    audioread: 3.0.0
    numpy: 1.22.3
    scipy: 1.8.0
    sklearn: 1.0.2
    joblib: 1.1.0
    decorator: 5.1.1
    soundfile: 0.11.0
    resampy: 0.4.2
    numba: 0.56.4
    pooch: v1.6.0
    packaging: 21.3
    numpydoc: None
    sphinx: None
    sphinx_rtd_theme: None
    sphinx_multiversion: None
    sphinx_gallery: None
    mir_eval: None
    ipython: None
    sphinxcontrib-svg2pdfconverter: None
    pytest: None
    pytest-mpl: None
    pytest-cov: None
    matplotlib: 3.5.1
    samplerate: None
    soxr: None
    contextlib2: None
    presets: None

    Additional context

    Please help me fix this issue. Because x ends up as an empty 1-D NumPy array, I cannot proceed any further, and I have not found a solution anywhere.

    Upstream/dependency bug 
    opened by Soumendraprasad 8
  • Note to update for scipy 1.10

    This is just a general place-holder note to look through the scipy 1.10 release notes for functionality that we should make use of (if not depend on).

    There appear to be relevant additions to the interpolate and signal modules, but we should investigate closely before the 0.10 release over here.

    management 
    opened by bmcfee 2
  • Librosa import just stops and does nothing

    I import librosa in my program using the following line:

    import librosa

    When I run my program, it just stops there and does nothing. There is no error message, so I can't give you one.

    The screenshot of the beginning of my program and the output window shows that the import librosa line stops everything.

    If you need more info, just ask.

    thank you

    question Upstream/dependency bug 
    opened by DorianCreuze 4
  • Missing information in documentation of librosa.feature.chroma_cqt

    In the documentation of librosa.feature.chroma_cqt the tuning parameter is described as follows:

    tuning : float
        Deviation (in fractions of a CQT bin) from A440 tuning
    

    This left me somewhat wondering about the automatic tuning estimation offered in librosa.cqt. After inspecting the code it seems that this parameter is only used for passing it to the librosa.cqt or librosa.hybrid_cqt functions. There this parameter is described as follows:

    tuning : None or float
            Tuning offset in fractions of a bin.
            If ``None``, tuning will be automatically estimated from the signal.
            The minimum frequency of the resulting CQT will be modified to
            ``fmin * 2**(tuning / bins_per_octave)``.
    

    In addition, the default parameter for tuning in chroma_cqt is None while the default parameter for it in cqt and hybrid_cqt is 0.0.

    From my point of view, the description of the tuning parameter in the chroma_cqt function should be the same, and should therefore be extended with the information about what None actually means. I'm not sure about the default values, however: it would make sense to me to make them equal, but this may introduce problems with old code.
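
The effect of the tuning offset on the minimum frequency follows directly from the formula quoted in the cqt docstring above. The helper below is a hypothetical illustration of that formula only; it does not call librosa and does not cover the automatic estimation that tuning=None triggers.

```python
def tuned_fmin(fmin, tuning, bins_per_octave=12):
    """Apply ``fmin * 2**(tuning / bins_per_octave)`` from the docstring."""
    return fmin * 2.0 ** (tuning / bins_per_octave)


if __name__ == "__main__":
    # A tuning of 0.0 leaves fmin unchanged.
    print(tuned_fmin(440.0, 0.0))
    # Half a bin sharp at 12 bins per octave is a quarter-tone shift.
    print(tuned_fmin(440.0, 0.5))
```

A tuning equal to bins_per_octave shifts fmin by exactly one octave, which is a quick sanity check on the units ("fractions of a bin").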

    documentation 
    opened by si-timme 1
Releases(0.9.2)
  • 0.9.2(Jun 27, 2022)

    What's Changed

    • updated showversions to match setup.cfg. fixes #1455 by @bmcfee in https://github.com/librosa/librosa/pull/1457
    • switched submodule url to https in advance of git:// deprecation by @bmcfee in https://github.com/librosa/librosa/pull/1461
    • Fix db_to_amplitude docs by @i-aki-y in https://github.com/librosa/librosa/pull/1469
    • Improved read me by @Asmitha-K in https://github.com/librosa/librosa/pull/1473
    • Expanded documentation for viterbi_discriminative by @bmcfee in https://github.com/librosa/librosa/pull/1475
    • Allow preconstructed audioread objects in load by @bmcfee in https://github.com/librosa/librosa/pull/1477
    • Improved edits to the README.md file by @cr2007 in https://github.com/librosa/librosa/pull/1479
    • Fix function name in docs of STFT by @LorenzNickel in https://github.com/librosa/librosa/pull/1487
    • pinned sphinx version to 4.5 for doc site by @bmcfee in https://github.com/librosa/librosa/pull/1491
    • bug-fix for multichannel splitting by @bmcfee in https://github.com/librosa/librosa/pull/1493
    • Remove redundant article in CONTRIBUTING.md by @LorenzNickel in https://github.com/librosa/librosa/pull/1507
    • Fix typo in warning in inverse.py by @LorenzNickel in https://github.com/librosa/librosa/pull/1508
    • Fix multiple typos in spectral.py by @LorenzNickel in https://github.com/librosa/librosa/pull/1509
    • Speed up magphase by @futurulus in https://github.com/librosa/librosa/pull/1504
    • Escape special characters in the directory paths when calling find_files() by @Xiao-Ming in https://github.com/librosa/librosa/pull/1511
    • Documentation updates ahead of 0.9.2 release by @bmcfee in https://github.com/librosa/librosa/pull/1513

    New Contributors

    • @i-aki-y made their first contribution in https://github.com/librosa/librosa/pull/1469
    • @Asmitha-K made their first contribution in https://github.com/librosa/librosa/pull/1473
    • @cr2007 made their first contribution in https://github.com/librosa/librosa/pull/1479
    • @LorenzNickel made their first contribution in https://github.com/librosa/librosa/pull/1487
    • @futurulus made their first contribution in https://github.com/librosa/librosa/pull/1504
    • @Xiao-Ming made their first contribution in https://github.com/librosa/librosa/pull/1511

    Full Changelog: https://github.com/librosa/librosa/compare/0.9.1...0.9.2

  • 0.9.1(Feb 15, 2022)

    This minor release restores API compatibility for functions with positional arguments.

    See https://librosa.org/doc/latest/changelog.html for details.

  • 0.9.0(Feb 7, 2022)

    This release introduces multichannel support and a substantial number of bug fixes and enhancements.

    See https://librosa.org/doc/main/changelog.html#v0-9-0 for a full list of changes.

  • 0.9.0rc0(Jan 31, 2022)

  • 0.8.1(May 26, 2021)

    This is primarily a bug-fix and maintenance release.

    New features include interactive waveform visualization, signal de-emphasis effect, and expanded resampling modes.

    A full list of changes can be found at https://librosa.org/doc/main/changelog.html#v0-8-1

  • 0.8.1rc2(May 25, 2021)

  • 0.8.1rc1(May 23, 2021)

    First release candidate for 0.8.1.

    This is primarily a bug-fix and maintenance release. A full list of changes can be found at https://librosa.org/doc/main/changelog.html#v0-8-1

  • 0.8.0(Jul 22, 2020)

    First release of the 0.8 series.

    Major changes include:

    • Removed support for Python 3.5 and earlier.
    • Added pitch tracking (yin and pyin)
    • Variable-Q transform
    • Hindustani and Carnatic notation support
    • Expanded collection of example tracks
    • Numerous speedups and bugfixes
  • 0.7.2(Jan 13, 2020)

    This is primarily a bug-fix release, and most likely the last release in the 0.7 series.

    It includes fixes for errors in dynamic time warping (DTW) and RMS energy calculation, and several corrections to the documentation.

    Inverse-liftering is now supported in MFCC inversion, and an implementation of mu-law companding has been added.

    Please refer to the documentation for a full list of changes.

  • 0.7.1(Oct 9, 2019)

    This minor revision includes mainly bug fixes, but there are a few new features as well:

    • Griffin-Lim for constant-Q spectra
    • Multi-dimensional in-place framing
    • Enhanced compatibility with HTK for MFCC generation
    • Time-frequency reassigned spectrograms

    Please refer to the documentation for a full list of changes.

  • 0.7.0(Jul 8, 2019)

    First release of the 0.7 series.

    Major changes include streaming mode, feature inversion, faster decoding, more efficient spectral transformations, and numerous API enhancements.

  • 0.7.0rc1(Jul 1, 2019)

    First release candidate of the 0.7 series.

    Major changes include streaming mode, faster decoding, more efficient spectral transformations, and numerous API enhancements.

  • 0.6.3(Feb 13, 2019)

  • 0.6.2(Aug 9, 2018)

  • 0.6.1(May 24, 2018)

    0.6.1 final release. This contains no substantial changes from 0.6.1rc0.

    The major changes from 0.6.0 include:

    • new module librosa.sequence for Viterbi decoding
    • Per-channel energy normalization (librosa.pcen())

    As well as numerous bug-fixes and acceleration enhancements.

  • 0.6.1rc0(May 22, 2018)

    First release candidate for 0.6.1.

    This is primarily a bugfix release, though two new features have been added: per-channel energy normalization (pcen) and Viterbi decoding (librosa.sequence module).

  • 0.6.0(Feb 17, 2018)

  • 0.6.0rc1(Feb 13, 2018)

  • 0.6.0rc0(Feb 10, 2018)

    First release candidate for 0.6.

    This is a major revision, and contains numerous bugfixes and some small API changes that break backward compatibility with the 0.5 series. A full changelog is provided in the documentation.

  • 0.5.1(May 8, 2017)

  • 0.5.0rc0(Feb 11, 2017)

  • 0.4.3rc0(May 15, 2016)

  • 0.4.2(Feb 20, 2016)

  • 0.4.1(Oct 17, 2015)

    This minor revision expands the rhythm analysis functionality, and fixes several small bugs.

    It is also the first release to officially support Python 3.5.

    For a complete list of changes, refer to the CHANGELOG.

  • 0.4.1rc0(Oct 14, 2015)

  • 0.4.0rc2(May 23, 2015)

  • 0.4.0rc1(Mar 4, 2015)

    There are still a few issues to clean up with the 0.4 milestone, but these mainly relate to testing.

    This rc should be essentially feature complete.

Owner
librosa
Python tools for music and audio analysis
librosa
Jarvis From Basic to Advance - a voice assistant similar to JARVIS (from the Iron Man movie)

JARVIS (Basic to Advance) This was my attempt to make a voice assistant similar to JARVIS (from the Iron Man movie). Let's be honest, it's not as intelligent

codesempai 17 Dec 25, 2022
Python module for handling audio metadata

Mutagen is a Python module to handle audio metadata. It supports ASF, FLAC, MP4, Monkey's Audio, MP3, Musepack, Ogg Opus, Ogg FLAC, Ogg Speex, Ogg The

Quod Libet 1.1k Dec 31, 2022
music library manager and MusicBrainz tagger

beets Beets is the media library management system for obsessive music geeks. The purpose of beets is to get your music collection right once and for

beetbox 11.3k Dec 31, 2022
A fast MDCT implementation using SciPy and FFTs

MDCT A fast MDCT implementation using SciPy and FFTs Installation As usual pip install mdct Dependencies NumPy SciPy STFT Usage import mdct spectrum

Nils Werner 43 Sep 02, 2022
Carnatic Notes Predictor for audio files

Carnatic Notes Predictor for audio files Link for live application: https://share.streamlit.io/pradeepak1/carnatic-notes-predictor-for-audio-files/mai

1 Nov 06, 2021
The venturimeter works on the principle of Bernoulli's equation, i.e., the pressure decreases as the velocity increases.

The venturimeter works on the principle of Bernoulli's equation, i.e., the pressure decreases as the velocity increases. The cross-section of the throat is less than the cross-section of the inlet pi

Shankar Mahadevan L 1 Dec 03, 2021
Real-time audio visualizations (spectrum, spectrogram, etc.)

Friture Friture is an application to visualize and analyze live audio data in real-time. Friture displays audio data in several widgets, such as a sco

Timothée Lecomte 700 Dec 31, 2022
Voice assistant made with Python that searches for COVID-19 data (such as total cases and deaths) in a specific country

covid19-voice-assistant Voice assistant made with Python that searches for COVID-19 data (such as total cases and deaths) in a specific country installi

Miguel 2 Dec 05, 2021
The core of VOICEVOX, a free, medium-quality text-to-speech software

The core of VOICEVOX, a free, medium-quality text-to-speech software

Hiroshiba 0 Aug 29, 2022
Library for working with sound files in the .ogg, .mp3, and .wav formats

Library for working with sound files in the .ogg, .mp3, and .wav formats. Here, "working" means playing sound files in a straight line and in the background, obtaining information about the sound file (auth

Romanin 2 Dec 15, 2022
Musillow is a music recommender app that finds songs similar to your favourites.

MUSILLOW The music recommender app Check it out now!!! View Demo · Report Bug · Request Feature About The App Musillow is a music recommender app that

3 Feb 03, 2022
Reading list for research topics in sound event detection

Sound event detection aims at processing the continuous acoustic signal and converting it into symbolic descriptions of the corresponding sound events present at the auditory scene.

Soham 64 Jan 05, 2023
All-In-One Digital Audio Workstation and Plugin Suite

How to install Windows Mac OS X Fedora Ubuntu How to Build Debian and Ubuntu Fedora All Other Linux Distros Mac OS X Windows What is MusiKernel? MusiK

j3ffhubb 111 Sep 21, 2021
SomaFM Plugin for Kodi

SomaFM XBMC Plugin This description is a bit outdated. You can simply install this addon by browsing the official repositories from within Kodi. Insta

7 Jan 21, 2022
Python I/O for STEM audio files

stempeg = stems + ffmpeg Python package to read and write STEM audio files. Technically, stems are audio containers that combine multiple audio stream

Fabian-Robert Stöter 72 Dec 23, 2022
Voice assistant in Russian

Voice assistant in Russian

KreO 1 Jun 30, 2022
Official implementation of A cappella: Audio-visual Singing Voice Separation, from BMVC 2021

Y-Net Official implementation of A cappella: Audio-visual Singing Voice Separation, British Machine Vision Conference 2021 Project page: ipcv.github.io

Juan F. Montesinos 12 Oct 22, 2022
Spotify Song Recommendation Program

Spotify-Song-Recommendation-Program Made by Esra Nur Özüm Written in Python The aim of this project was to build a recommendation system that recommen

esra nur özüm 1 Jun 30, 2022
Open Sound Strip, Sequence or Record in Audacity

Audacity Tools For Blender Sound editing in Blender Video Sequence Editor with Audacity integrated. Send/receive the full edited sequence or single st

64 Dec 31, 2022
Dataset and baseline code for the VocalSound dataset (ICASSP2022).

VocalSound: A Dataset for Improving Human Vocal Sounds Recognition Introduction Citing Download VocalSound Dataset Details Baseline Experiment Contact

Yuan Gong 58 Jan 03, 2023