ConferencingSpeech2022; Non-intrusive Objective Speech Quality Assessment (NISQA) Challenge

Last update: Dec 02, 2022

Overview

ConferencingSpeech 2022 challenge

This repository contains the dataset list and the scripts required for the ConferencingSpeech 2022 challenge. For more details about the challenge, please see our website.

Details

baseline: this folder contains the baseline system, including the inference model exported by the inference scripts;
eval: this folder contains the evaluation scripts that calculate PLCC, RMSE, and SRCC;
data-sets: this folder contains the training and development test datasets provided to participants:
- Tencent Corpus: this dataset includes about 14,000 Chinese speech clips with simulated (e.g. codecs, packet loss, background noise) and live conditions.
- NISQA Corpus: this corpus includes more than 14,000 speech samples with simulated (e.g. codecs, packet loss, background noise) and live (e.g. mobile phone, Zoom, Skype, WhatsApp) conditions.
- IU Bloomington Corpus: 10,000 speech signals extracted from the COSINE and VOiCES datasets, each truncated to between 3 and 6 seconds.
- PSTN Corpus: about 80,000 speech clips transmitted through classic public switched telephone networks, each truncated to 10 seconds.
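
For readers unfamiliar with the three metrics named above, the following is a minimal sketch of how PLCC (Pearson linear correlation), SRCC (Spearman rank correlation), and RMSE can be computed between subjective MOS labels and model predictions. It is not the code in the eval folder: the function names (`plcc`, `srcc`, `rmse`, `average_ranks`) and the sample scores are hypothetical, and the challenge's official scripts may apply extra steps (for example, a mapping of predictions before RMSE) that are not shown here.

```python
import math


def plcc(x, y):
    """Pearson linear correlation coefficient between two score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def srcc(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return plcc(average_ranks(x), average_ranks(y))


def rmse(x, y):
    """Root mean squared error between labels and predictions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


if __name__ == "__main__":
    # Hypothetical MOS labels and predictions, for illustration only.
    mos_true = [1.0, 2.0, 3.0, 4.0, 5.0]
    mos_pred = [1.5, 2.5, 3.5, 4.5, 5.5]
    print("PLCC:", plcc(mos_true, mos_pred))
    print("SRCC:", srcc(mos_true, mos_pred))
    print("RMSE:", rmse(mos_true, mos_pred))
```

PLCC and SRCC reward predictions that track the labels linearly and in rank order respectively, while RMSE penalizes absolute offset, so a model with a constant bias can score well on both correlations and still have a nonzero RMSE.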

Requirements

To install the requirements, install Anaconda and then run:

conda env create -f envs.yml

This creates a new environment named "conferencingSpeech". Activate it before continuing:

conda activate conferencingSpeech

Code license

Apache 2.0

Library for Russian imprecise rhymes generation

Google AI 2018 BERT pytorch implementation

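Pretrain a BERT model with the masked LM pretraining task. Trains model representations on domain-specific corpora to improve performance on downstream tasks.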

Conversational text Analysis using various NLP techniques

Pretrain CPM - pretraining code for a large-scale pretrained language model

Finetune gpt-2 in google colab

This repository serves as a place to document a toy attempt on how to create a generative text model in Catalan, based on GPT-2

This repository consists of a complete guide on natural language processing (NLP) in Python where we'll learn various techniques for implementing NLP including parsing & text processing and understand how to use NLP for text feature engineering.

DLO8012: Natural Language Processing & CSL804: Computational Lab - II

VMD Audio/Text control with natural language

This repository contains the code, data, and models of the paper titled "XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages" published in Findings of the Association for Computational Linguistics: ACL 2021.

This library is testing the ethics of language models by using natural adversarial texts.

Korean extractive summarization. Code from team 화성갈끄니까 for the 2021 AI Text Summarization Online Hackathon

Contract Understanding Atticus Dataset

Long text token classification using LongFormer

A design of MIDI language for music generation task, specifically for Natural Language Processing (NLP) models.

pyMorfologik - Python binding for Morfologik.

Code for paper: An Effective, Robust and Fairness-aware Hate Speech Detection Framework

Need: Image Search With Python

Deep learning for NLP crash course at ABBYY.