Parselmouth - Praat in Python, the Pythonic way Parselmouth is a Python library for the Praat software. Though other attempts have been made at portin
Project DeepSpeech DeepSpeech is an open-source Speech-To-Text engine, using a model trained by machine learning techniques based on Baidu's Deep Spee
Whipper Whipper is a Python 3 (3.6+) CD-DA ripper based on the morituri project (CDDA ripper for *nix systems aiming for accuracy over speed). It star
Pyrogram bot to automate streaming music in voice chats Help If you face an error, want to discuss this project or get support for it, join it's group
GiantMIDI-Piano is a classical piano MIDI dataset contains 10,854 MIDI files of 2,786 composers
Summary Pyroomacoustics is a software package aimed at the rapid development and testing of audio array processing algorithms. The content of the pack
SpeechPy Official Project Documentation Table of Contents Documentation Which Python versions are supported Citation How to Install? Local Installatio
Nerd Dictation Offline Speech to Text for Desktop Linux. This is a utility that provides simple access speech to text for using in Linux without being
Friture Friture is an application to visualize and analyze live audio data in real-time. Friture displays audio data in several widgets, such as a sco
Play and Record Sound with Python This Python module provides bindings for the PortAudio library and a few convenience functions to play and record Nu
beets Beets is the media library management system for obsessive music geeks. The purpose of beets is to get your music collection right once and for
LibXtract LibXtract is a simple, portable, lightweight library of audio feature extraction functions. The purpose of the library is to provide a relat
python_speech_features This library provides common speech features for ASR including MFCCs and filterbank energies. If you are not sure what MFCCs ar
Users can transcribe their favorite piano recordings to MIDI files after installation
MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical Modeling Demos | Blog Post | Colab Notebook | Paper | MIDI-DDSP is a hierarchical
PianoPlayer - Automatic fingering generator for piano scores
spafe aims to simplify features extractions from mono audio files. The library can extract of the following features: BFCC, LFCC, LPC, LPCC, MFCC, IMFCC, MSRCC, NGCC, PNCC, PSRCC, PLP, RPLP, Frequency-stats etc. It also provides various filterbank modules (Mel, Bark and Gammatone filterbanks) and other spectral statistics.
Gentle Robust yet lenient forced-aligner built on Kaldi. A tool for aligning speech with text. Getting Started There are three ways to install Gentle.
MusicRecognizer Music recognition application for windows You can choose from which of the devices the recording will be made. If you choose speakers,
⚠️ Checkout develop branch to see what is coming in pyannote.audio 2.0: a much smaller and cleaner codebase Python-first API (the good old pyannote-au
Exaile Exaile is a music player with a simple interface and powerful music management capabilities. Features include automatic fetching of album art,
Mopidy Mopidy is an extensible music server written in Python. Mopidy plays music from local disk, Spotify, SoundCloud, Google Play Music, and more. Y
librosa A python package for music and audio analysis. Documentation See https://librosa.org/doc/ for a complete reference manual and introductory tut
aubio aubio is a library to label music and sounds. It listens to audio signals and attempts to detect events. For instance, when a drum is hit, at wh
LPC_for_TTS Linear Prediction Coefficients estimation from mel-spectrogram implemented in Python based on Levinson-Durbin algorithm. 基于Levinson-Durbin
Speech Algorithms Collections
Muzic is a research project on AI music that empowers music understanding and generation with deep learning and artificial intelligence.
pedalboard is a Python library for adding effects to audio. It supports a number of common audio effects out of the box, and also allows the use of VST3® and Audio Unit plugin formats for third-party effects.
Audio augmentations library for PyTorch for audio in the time-domain, with support for stochastic data augmentations as used often in self-supervised / contrastive learning.
Essentia Essentia is an open-source C++ library for audio analysis and audio-based music information retrieval released under the Affero GPL license.
pyms Minimal command-line music player written in Python. Designed with elegance and minimalism. Resizes dynamically with your terminal. Dependencies
madmom Madmom is an audio signal processing library written in Python with a strong focus on music information retrieval (MIR) tasks. The library is i
Python implementation of STOI Implementation of the classical and extended Short Term Objective Intelligibility measures Intelligibility measure which
GNU Radio is a free & open-source software development toolkit that provides signal processing blocks to implement software radios. It can be used wit
Spotifyd An open source Spotify client running as a UNIX daemon. Spotifyd streams music just like the official client, but is more lightweight and sup
py-webrtcvad This is a python interface to the WebRTC Voice Activity Detector (VAD). It is compatible with Python 2 and Python 3. A VAD classifies a p
Audiomentations A Python library for audio data augmentation. Inspired by albumentations. Useful for deep learning. Runs on CPU. Supports mono audio a
Spotify Voice Control A voice control utility for Spotify · Report Bug · Request
DDRecorder Headless全自动B站直播录播、切片、上传一体工具 感谢 FortuneDayssss/BilibiliUploader 安装指南（Windows） 在Release下载zip包解压。 修改配置文件config.json 双击运行DDRecorder.exe （这将使用co
PyAV PyAV is a Pythonic binding for the FFmpeg libraries. We aim to provide all of the power and control of the underlying library, but manage the gri
Ex Falso / Quod Libet - A Music Library / Editor / Player Quod Libet is a music management program. It provides several different ways to view your au
SoundConverter A simple sound converter application for the GNOME environment. It reads anything the GStreamer library can read, and writes Ogg Vorbis
Telegram Voice-Chat Bot Telegram Voice-Chat Bot To Play Music From Various Sources In Your Group Support All linux based os. Windows Mac Diagram Requi
A Python library for audio feature extraction, classification, segmentation and applications This doc contains general info. Click here for the comple
Music player First, if you wonder about what is supposed to be a music player or what makes a music player different from a simple media player, read
Matching + Mastering = ❤️ Matchering 2.0 is a novel Containerized Web Application and Python Library for audio matching and mastering. It follows a si
Pydub Pydub lets you do stuff to audio in a way that isn't stupid. Stuff you might be looking for: Installing Pydub API Documentation Dependencies Pla
Mousai is a simple application that can identify song like Shazam. It saves the artist, album, and title of the identified song in a JSON file.
Mutagen is a Python module to handle audio metadata. It supports ASF, FLAC, MP4, Monkey's Audio, MP3, Musepack, Ogg Opus, Ogg FLAC, Ogg Speex, Ogg The
Program that lowers volume when you die and get flashed in CS:GO. It aims to lower the chance of hearing damage by reducing overall sound exposure. Uses game state integration. Anti-cheat safe.
praudio provides objects and a script for performing complex preprocessing operations on entire audio datasets with one command.
Audio-Driven Emotional Video Portraits [CVPR2021] Xinya Ji, Zhou Hang, Kaisiyuan Wang, Wayne Wu, Chen Change Loy, Xun Cao, Feng Xu [Project] [Paper] G
puddletag puddletag is an audio tag editor (primarily created) for GNU/Linux similar to the Windows program, Mp3tag. Unlike most taggers for GNU/Linux
tinytag tinytag is a library for reading music meta data of MP3, OGG, OPUS, MP4, M4A, FLAC, WMA and Wave files with python Install pip install tinytag