This repo contains simple to use, pretrained/training-less models for speaker diarization.

Last update: Jan 20, 2022

PyDiar

This repo contains simple-to-use models for speaker diarization that are either pretrained or need no training.

Supported Models

Binary Key Speaker Modeling

Based on pyBK by Jose Patino, which implements the diarization system from "The EURECOM submission to the first DIHARD Challenge" by Jose Patino, Héctor Delgado, and Nicholas Evans.

If you have any other models you would like to see added, please open an issue.

Usage

This library seeks to provide a very basic interface. To use the Binary Key model on a file, do something like this:

import numpy as np
from pydiar.models import BinaryKeyDiarizationModel, Segment
from pydiar.util.misc import optimize_segments
from pydub import AudioSegment

INPUT_FILE = "test.wav"

# Load the file, resample it to the rate passed to the model, and downmix to mono
sample_rate = 32000
audio = AudioSegment.from_wav(INPUT_FILE)
audio = audio.set_frame_rate(sample_rate)
audio = audio.set_channels(1)

diarization_model = BinaryKeyDiarizationModel()
segments = diarization_model.diarize(
    sample_rate, np.array(audio.get_array_of_samples())
)
optimized_segments = optimize_segments(segments)

Now optimized_segments contains a list of segments, each with its start, length, and speaker ID.
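As a rough illustration of working with such a list, the sketch below sums the speaking time per speaker. It does not import pydiar: DemoSegment is a hypothetical stand-in, and its field names (start, length, speaker_id) are assumptions based on the description above, so check them against the library's actual Segment class before reusing this.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class DemoSegment:
    # Hypothetical stand-in for a diarization segment, NOT pydiar's Segment.
    # Field names are assumed from the README text (start, length, speaker id).
    start: float
    length: float
    speaker_id: int


def speaking_time(segments):
    """Return a dict mapping each speaker id to its summed segment length."""
    totals = defaultdict(float)
    for seg in segments:
        totals[seg.speaker_id] += seg.length
    return dict(totals)


demo = [
    DemoSegment(start=0.0, length=2.5, speaker_id=0),
    DemoSegment(start=2.5, length=1.0, speaker_id=1),
    DemoSegment(start=3.5, length=4.0, speaker_id=0),
]
print(speaking_time(demo))
```

If the real segments expose the same three attributes, passing optimized_segments to speaking_time should work unchanged.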

Example

A simple script that reads an audio file, diarizes it, and transcribes it into the WebVTT format can be found in examples/generate_webvtt.py. To use it, download a Vosk model from https://alphacephei.com/vosk/models and then run the script using:

poetry install
poetry run python -m examples.generate_webvtt -i PATH/TO/INPUT.wav -m PATH/TO/VOSK_MODEL
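For readers unfamiliar with WebVTT, the following is a minimal sketch of how speaker-labelled text could be written out as cues. It is not taken from examples/generate_webvtt.py; the function names and the (start, length, speaker, text) tuple layout are assumptions made for illustration only.

```python
def vtt_timestamp(seconds):
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(cues):
    """Build a WebVTT document from (start_s, length_s, speaker_id, text) tuples.

    The tuple layout is a hypothetical convention for this sketch.
    """
    lines = ["WEBVTT", ""]  # mandatory header followed by a blank line
    for start, length, speaker, text in cues:
        lines.append(f"{vtt_timestamp(start)} --> {vtt_timestamp(start + length)}")
        # A WebVTT voice tag marks who is speaking in the cue
        lines.append(f"<v Speaker {speaker}>{text}")
        lines.append("")
    return "\n".join(lines)


print(to_webvtt([(0.0, 1.5, 0, "Hello there."), (1.5, 2.0, 1, "Hi!")]))
```

Each cue consists of a timing line, the cue text, and a blank line; the voice tag is optional in WebVTT but is a convenient place for the speaker ID.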
