Original implementation of the pooling method introduced in "Speaker embeddings by modeling channel-wise correlations"

Last update: Apr 30, 2022

Overview

Speaker-Embeddings-Correlation-Pooling

This is the original implementation of the pooling method introduced in "Speaker embeddings by modeling channel-wise correlations" by T. Stafylakis, J. Rohdin, and L. Burget (Interspeech 2021), which you may find here. The paper is the result of a collaboration between Omilia - Conversational Intelligence and Brno University of Technology (BUT).

The code is written in TensorFlow 1 (TF1), but it should work with TF2 too. I only provide the code for creating the network, together with the hyperparameters it requires. The training hyperparameters we used can be found in the paper.

The code is well commented, at least the parts and (hyper-)parameters required for correlation pooling.

Beyond the experiments reported in the paper, the code allows the user to: (a) combine standard statistics pooling with correlation pooling, by concatenating the two pooling outputs into a single pooling layer, and (b) extract correlation pooling from the outputs of all 4 internal ResNet blocks (aka stages) and concatenate them in the pooling layer.
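To make options (a) and (b) concrete, here is a minimal NumPy sketch. It is not the repository's TensorFlow code: the function names, the (time, frequency, channels) input layout, and the choice to pool correlations over all frequency bins at once are assumptions made for illustration. The paper's channel reduction and frequency-range handling are not reproduced here.

```python
import numpy as np


def stats_pooling(x, eps=1e-8):
    """Standard statistics pooling: mean and std over time.

    x has the assumed shape (T, F, C); the result has 2 * F * C entries.
    """
    t, f, c = x.shape
    flat = x.reshape(t, f * c)
    mean = flat.mean(axis=0)
    std = np.sqrt(flat.var(axis=0) + eps)
    return np.concatenate([mean, std])


def correlation_pooling(x, eps=1e-8):
    """Channel-wise correlations, pooled over time and frequency.

    x has the assumed shape (T, F, C); the result keeps the strictly
    upper-triangular part of the C x C correlation matrix, i.e.
    C * (C - 1) / 2 entries.
    """
    t, f, c = x.shape
    flat = x.reshape(t * f, c)
    centered = flat - flat.mean(axis=0, keepdims=True)
    normed = centered / np.sqrt(centered.var(axis=0, keepdims=True) + eps)
    corr = normed.T @ normed / (t * f)
    rows, cols = np.triu_indices(c, k=1)
    return corr[rows, cols]


def combined_pooling(x):
    """Option (a): concatenate statistics pooling and correlation pooling."""
    return np.concatenate([stats_pooling(x), correlation_pooling(x)])


def multi_block_pooling(block_outputs):
    """Option (b): correlation pooling per block output, then concatenation."""
    return np.concatenate([correlation_pooling(b) for b in block_outputs])
```

With an input of shape (50, 4, 8), statistics pooling yields 64 values and correlation pooling yields 28, so the combined layer has 92. The actual dimensions in the repository depend on its hyperparameters.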

The code could be written more efficiently using tensor-only operators. However, to facilitate research we have implemented it using lists of tensors, e.g. after merging frequency bins into frequency ranges. Despite this inefficiency, we observe no difference in training speed between correlation pooling and standard stats pooling.
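The trade-off between the two styles can be illustrated with a small NumPy sketch. This is an assumption-laden illustration, not the repository's code: the helper names, the (time, frequency, channels) layout, and the equal-sized contiguous frequency ranges are all hypothetical. The list-based version is easy to modify one range at a time, while the tensor-only version computes all ranges in one batched operation.

```python
import numpy as np


def _upper_correlations(flat, eps=1e-8):
    """Strictly upper-triangular channel correlations of an (N, C) matrix."""
    centered = flat - flat.mean(axis=0, keepdims=True)
    normed = centered / np.sqrt(centered.var(axis=0, keepdims=True) + eps)
    corr = normed.T @ normed / flat.shape[0]
    rows, cols = np.triu_indices(flat.shape[1], k=1)
    return corr[rows, cols]


def pool_with_lists(x, n_ranges):
    """List-based variant: one tensor per frequency range, pooled separately.

    x has the assumed shape (T, F, C); F must be divisible by n_ranges.
    """
    c = x.shape[2]
    ranges = np.split(x, n_ranges, axis=1)  # list of (T, F / n_ranges, C)
    return np.concatenate([_upper_correlations(r.reshape(-1, c)) for r in ranges])


def pool_with_tensors(x, n_ranges, eps=1e-8):
    """Tensor-only variant: all frequency ranges handled in one batched op."""
    t, f, c = x.shape
    bins = f // n_ranges
    # (T, F, C) -> (T, R, B, C) -> (R, T * B, C)
    grouped = x.reshape(t, n_ranges, bins, c).transpose(1, 0, 2, 3)
    grouped = grouped.reshape(n_ranges, t * bins, c)
    centered = grouped - grouped.mean(axis=1, keepdims=True)
    normed = centered / np.sqrt(centered.var(axis=1, keepdims=True) + eps)
    # Batched C x C correlation matrix, one per frequency range.
    corr = np.einsum("rnc,rnd->rcd", normed, normed) / (t * bins)
    rows, cols = np.triu_indices(c, k=1)
    return corr[:, rows, cols].reshape(-1)
```

Both functions should return the same vector up to floating-point error, which makes the list-based form a convenient reference when experimenting with a faster batched implementation.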

Start with the file train_resnet.py, which creates the ResNet (with the pooling mechanism) and sets its parameters. All parameters are set to reproduce our best-performing experiment (P7 in the paper).

So, try it and let us know what you get! Themos

Owner

Themos Stafylakis
