Original implementation of the pooling method introduced in "Speaker embeddings by modeling channel-wise correlations"

Last update: Apr 30, 2022

Overview

Speaker-Embeddings-Correlation-Pooling

This is the original implementation of the pooling method introduced in "Speaker embeddings by modeling channel-wise correlations" by T. Stafylakis, J. Rohdin, and L. Burget (Interspeech 2021), which you may find here. The paper is the result of a collaboration between Omilia - Conversational Intelligence and Brno University of Technology (BUT).

The code is written in TensorFlow 1 (TF1), but it should work with TF2 too. I only provide the code for creating the network and the required hyperparameters. The training hyperparameters we used can be found in the paper.

The code is well commented, at least the parts and (hyper-)parameters required for correlation pooling.

Apart from the experiments provided in the paper, the code allows the user to: (a) combine standard statistics pooling with correlation pooling, by concatenating the two pooling layers into a single one, and (b) extract correlation pooling from the outputs of all 4 internal ResNet blocks (aka stages) and concatenate them in the pooling layer.
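Option (a) above can be illustrated with a small NumPy sketch. This is not the repository's TensorFlow code: the function names, the eps value, and the choice to keep only the upper triangle of the correlation matrix are assumptions made for illustration, and the input is a plain (time, channels) array rather than a ResNet output.

```python
import numpy as np


def stats_pooling(x, eps=1e-5):
    """Statistics pooling: per-channel mean and std over time.

    x has shape (T, C); the result has shape (2 * C,).
    """
    mean = x.mean(axis=0)
    std = np.sqrt(x.var(axis=0) + eps)
    return np.concatenate([mean, std])


def correlation_pooling(x, eps=1e-5):
    """Channel-wise correlation pooling (illustrative sketch).

    x has shape (T, C); the result has shape (C * (C - 1) / 2,).
    """
    # Standardize each channel over time.
    mean = x.mean(axis=0, keepdims=True)
    std = np.sqrt(x.var(axis=0, keepdims=True) + eps)
    z = (x - mean) / std
    # (C, C) matrix of correlations between channels.
    corr = z.T @ z / x.shape[0]
    # The matrix is symmetric with a near-constant diagonal, so only the
    # strictly upper triangle is kept (an assumption of this sketch).
    upper = np.triu_indices(x.shape[1], k=1)
    return corr[upper]


def combined_pooling(x):
    """Concatenate both pooling outputs into a single vector."""
    return np.concatenate([stats_pooling(x), correlation_pooling(x)])


rng = np.random.default_rng(0)
x = rng.standard_normal((200, 8))  # 200 frames, 8 channels
pooled = combined_pooling(x)
print(pooled.shape)  # (44,) = 2 * 8 + 8 * 7 / 2
```

The correlation part grows quadratically with the number of channels, so the channel dimension has to be kept small for the pooled vector to stay manageable.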

The code could be written more efficiently using tensor-only operators. However, to facilitate research, we have implemented it using lists of tensors, e.g. after merging frequency bins into frequency ranges. Despite this inefficiency, we observe no difference in training speed between correlation pooling and standard stats pooling.
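The list-of-tensors idea can be sketched in NumPy as follows. This is only an illustration under stated assumptions, not the repository's implementation: the function name and num_ranges parameter are hypothetical, bins within a range are merged by simple averaging, and the feature map is assumed to have shape (time, frequency, channels).

```python
import numpy as np


def correlation_pooling_by_range(feat, num_ranges, eps=1e-5):
    """Correlation pooling computed separately per frequency range.

    feat has shape (T, F, C). The result has shape
    (num_ranges * C * (C - 1) / 2,).
    """
    num_channels = feat.shape[2]
    # Split the frequency axis into a list of arrays, one per range.
    # np.array_split also handles F not divisible by num_ranges.
    ranges = np.array_split(feat, num_ranges, axis=1)
    upper = np.triu_indices(num_channels, k=1)
    pooled = []
    for block in ranges:
        # Merge the bins of this range by averaging (assumption): (T, C).
        merged = block.mean(axis=1)
        # Standardize each channel over time, then correlate channels.
        z = (merged - merged.mean(axis=0)) / np.sqrt(merged.var(axis=0) + eps)
        corr = z.T @ z / merged.shape[0]
        pooled.append(corr[upper])
    # Concatenate the per-range vectors into one pooling vector.
    return np.concatenate(pooled)


rng = np.random.default_rng(0)
feat = rng.standard_normal((100, 10, 6))  # 100 frames, 10 bins, 6 channels
out = correlation_pooling_by_range(feat, num_ranges=5)
print(out.shape)  # (75,) = 5 ranges * 6 * 5 / 2
```

Keeping one array per range in a Python list makes it easy to experiment with different range sizes or per-range processing, at the cost of a loop that a single batched tensor operation would avoid.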

Start with the file train_resnet.py, which creates the ResNet (with the pooling mechanism) and sets its parameters. All parameters are set so that you can reproduce our best-performing experiment (P7 in the paper).

So, try it and let us know what you get! Themos

Owner

Themos Stafylakis
