This repo contains simple-to-use, pretrained or training-free models for speaker diarization.

Last update: Jan 20, 2022

PyDiar

Supported Models

Binary Key Speaker Modeling

Based on pyBK by Jose Patino, which implements the diarization system from "The EURECOM submission to the first DIHARD Challenge" by Jose Patino, Héctor Delgado, and Nicholas Evans.

If you have any other models you would like to see added, please open an issue.

Usage

This library seeks to provide a very basic interface. To use the Binary Key model on a file, do something like this:

import numpy as np
from pydiar.models import BinaryKeyDiarizationModel, Segment
from pydiar.util.misc import optimize_segments
from pydub import AudioSegment

INPUT_FILE = "test.wav"

# Load the audio, then resample to the target rate and downmix to mono
sample_rate = 32000
audio = AudioSegment.from_wav(INPUT_FILE)
audio = audio.set_frame_rate(sample_rate)
audio = audio.set_channels(1)

diarization_model = BinaryKeyDiarizationModel()
segments = diarization_model.diarize(
    sample_rate, np.array(audio.get_array_of_samples())
)
optimized_segments = optimize_segments(segments)

After this, optimized_segments contains a list of segments, each with its start, length, and speaker id.
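As a rough illustration of working with such a list, the sketch below totals the speaking time per speaker. The field names `start`, `length`, and `speaker_id`, their units (seconds), and the integer speaker labels are assumptions drawn from the description above, not confirmed from the library. The `Segment` class defined here is a hypothetical stand-in so the snippet runs without pydiar installed, and the helper `speaking_time_per_speaker` is likewise made up for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical stand-in for pydiar's Segment (field names assumed)."""

    start: float  # assumed: start time in seconds
    length: float  # assumed: duration in seconds
    speaker_id: int  # assumed: one integer label per speaker


def speaking_time_per_speaker(segments):
    """Sum the segment lengths for each speaker id."""
    totals = defaultdict(float)
    for segment in segments:
        totals[segment.speaker_id] += segment.length
    return dict(totals)


# Made-up segments for illustration, not real diarization output.
example_segments = [
    Segment(start=0.0, length=2.5, speaker_id=0),
    Segment(start=2.5, length=4.0, speaker_id=1),
    Segment(start=6.5, length=1.5, speaker_id=0),
]
totals = speaking_time_per_speaker(example_segments)
```

If the real segments expose the same attributes, the list returned by optimize_segments could be passed to the helper in place of `example_segments`.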

Example

A simple script which reads an audio file, diarizes it and transcribes it into the WebVTT format can be found in examples/generate_webvtt.py. To use it, download a vosk model from https://alphacephei.com/vosk/models and then run the script using

poetry install
poetry run python -m examples.generate_webvtt -i PATH/TO/INPUT.wav -m PATH/TO/VOSK_MODEL
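For readers unfamiliar with WebVTT, the following is a minimal sketch of how diarized, transcribed segments might be written out as cues. It is not the code from examples/generate_webvtt.py. The function names, the `(start, length, speaker_id, text)` tuple layout, and the use of a voice tag to mark the speaker are all assumptions made for illustration.

```python
def format_timestamp(seconds):
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(cues):
    """Build a WebVTT document from (start, length, speaker_id, text) tuples.

    The tuple layout is hypothetical; times are assumed to be in seconds.
    """
    lines = ["WEBVTT", ""]
    for start, length, speaker_id, text in cues:
        end = start + length
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        # A voice tag labels the speaker of the cue text.
        lines.append(f"<v Speaker {speaker_id}>{text}")
        lines.append("")
    return "\n".join(lines)


# Made-up cue for illustration.
document = to_webvtt([(0.0, 2.5, 0, "Hello")])
```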
