Speech Recognition for Uyghur using Speech transformer

Last update: Nov 17, 2022

Overview

This project implements speech recognition for Uyghur using a Speech Transformer model.

Training:

The model is trained using both CTC loss and cross-entropy loss.
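As a rough illustration of what a combined CTC and cross-entropy objective computes, here is a minimal pure-Python sketch. It is not the project's training code: the function names, the `ctc_weight` parameter, and the weighted-sum combination are assumptions made for illustration, since the README does not say how the two losses are combined or which outputs each one is applied to.

```python
import math

NEG_INF = float("-inf")


def log_add(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def ctc_loss(log_probs, target, blank=0):
    """Negative log-likelihood of `target` under CTC (forward algorithm).

    log_probs: list of T frames, each a list of per-symbol log-probabilities.
    target: non-empty list of symbol ids without blanks.
    """
    # Extended label sequence: blank, y1, blank, y2, ..., blank
    ext = [blank]
    for symbol in target:
        ext += [symbol, blank]
    size = len(ext)

    # alpha[s] = log-probability of all partial alignments ending at ext[s]
    alpha = [NEG_INF] * size
    alpha[0] = log_probs[0][ext[0]]
    alpha[1] = log_probs[0][ext[1]]

    for t in range(1, len(log_probs)):
        new = [NEG_INF] * size
        for s in range(size):
            total = alpha[s]                      # stay on the same label
            if s >= 1:
                total = log_add(total, alpha[s - 1])  # advance by one
            # Skipping over a blank is only allowed between different symbols
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total = log_add(total, alpha[s - 2])
            new[s] = total + log_probs[t][ext[s]]
        alpha = new

    # A valid alignment ends on the last symbol or the trailing blank
    return -log_add(alpha[size - 1], alpha[size - 2])


def cross_entropy(log_probs, target):
    """Mean per-token negative log-likelihood of `target`."""
    return -sum(lp[y] for lp, y in zip(log_probs, target)) / len(target)


def joint_loss(enc_log_probs, dec_log_probs, target, ctc_weight=0.5):
    """Hypothetical weighted sum of the two losses (the weighting is an assumption)."""
    ctc = ctc_loss(enc_log_probs, target)
    ce = cross_entropy(dec_log_probs, target)
    return ctc_weight * ctc + (1.0 - ctc_weight) * ce
```

In this sketch the CTC term scores frame-level outputs against the label sequence without needing an alignment, while the cross-entropy term scores token-level outputs position by position. In practice a framework implementation of both losses would be used instead of hand-written loops.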

Extract results.7z and thuyg20_data.7z into the folder that contains the Python source files, then run:

python train.py

Recognition:

For recognition, only the pretrained model needs to be downloaded. Then run:

python .\tonu.py .\test6.wav

The output will look like this:

        Model loaded: results/UFormer_last.pth
            Best CER: 4.16%
             Trained: 276 epochs
The model has 36,418,306 trainable parameters
 Feature  has 25,869,058 trainable parameters
  Encoder has 4,205,568 trainable parameters
  Decoder has 6,343,680 trainable parameters

======================
Recognizing file .\test6.wav
test6.wav -> u qizlarning resimi chiqip qalsa bilekchila sinchilap qaraytti
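The "Best CER" figure in the output above is a character error rate. The sketch below shows one common way to compute such a metric: the Levenshtein edit distance between the reference and recognized text, divided by the reference length. The function names are hypothetical, and whether the project counts spaces or normalizes the text before scoring is not stated in the README, so this is only an illustration of the metric.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,             # delete a reference character
                cur[j - 1] + 1,          # insert a hypothesis character
                prev[j - 1] + (r != h),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


def cer(ref, hyp):
    """Character error rate: edit distance divided by the reference length."""
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One dropped character out of ten reference characters
    print(f"{cer('qizlarning', 'qizlarnin'):.2%}")
```

A corpus-level figure is usually obtained by summing the edit distances over all test utterances and dividing by the total number of reference characters, rather than by averaging per-utterance rates.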

This project uses

THUYG-20: A free Uyghur speech database released by CSLT@Tsinghua University & Xinjiang University

Reference

https://github.com/gentaiscool/end2end-asr-pytorch


Comments

W2Llayer

Dear Gheyret, Thanks for your work.

I spent some time today trying to figure out the source of this feature extraction layer. Can you point me to the paper or any reference on it?

I think it is a great design to extract speech features, so just want to understand it more deeply,

Thanks a lot,

Kelvin

opened by kelvinqin 2

Releases(premodel)

premodel(Jun 18, 2021)

Pretrained model.
Source code(tar.gz)
Source code(zip)
results.7z(131.19 MB)

Owner

Uyghur

