Persian Kaldi profile for Rhasspy built from open speech data

Last update: Aug 08, 2022

Overview

Persian Kaldi Profile

A Rhasspy profile for Persian (fa).

Installation

Get started by first installing Vosk:

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate
pip3 install --upgrade pip
pip3 install --upgrade wheel setuptools

# Install Vosk, plus librosa and numpy (transcribe.py fails with ModuleNotFoundError without them)
pip3 install vosk librosa numpy

Next, download the model and extract it:

wget 'https://github.com/rhasspy/fa_kaldi-rhasspy/releases/download/v1.0/vosk-model-small-fa-rhasspy-0.15.zip'
unzip vosk-model-small-fa-rhasspy-0.15.zip

Finally, run the transcribe.py Python program with the model and an audio file:

python3 transcribe.py vosk-model-small-fa-rhasspy-0.15 welcome.wav

{"result": [{"conf": 1.0, "end": 0.48, "start": 0.06, "word": "خوش"}, {"conf": 1.0, "end": 1.11, "start": 0.48, "word": "آمدید"}], "text": "خوش آمدید"}

For each audio file passed to transcribe.py, one line of JSON containing the transcription details is printed.
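Since each result is a single JSON line, it can be consumed by a small script. The following is a minimal sketch, not part of the repository: the helper name `words_from_line` is hypothetical, and it assumes the output has the same shape as the sample line above (a `result` list of word entries, which is treated as possibly absent when nothing was recognized).

```python
import json


def words_from_line(line):
    """Parse one JSON line printed by transcribe.py into (word, start, end, conf) tuples."""
    record = json.loads(line)
    # Assumption: "result" may be missing when no words were recognized,
    # so fall back to an empty list instead of raising KeyError.
    return [
        (w["word"], w["start"], w["end"], w["conf"])
        for w in record.get("result", [])
    ]


# Sample line copied from the README output above.
sample = (
    '{"result": [{"conf": 1.0, "end": 0.48, "start": 0.06, "word": "خوش"}, '
    '{"conf": 1.0, "end": 1.11, "start": 0.48, "word": "آمدید"}], '
    '"text": "خوش آمدید"}'
)

for word, start, end, conf in words_from_line(sample):
    print(f"{start:.2f}-{end:.2f}s  {word}  (conf {conf:.2f})")
```

To process several files, the output of transcribe.py could be piped into a script that calls this helper once per input line.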


Comments

PySoundFile failed. Trying audioread instead.
I just tried to run this command: python3 transcribe.py vosk-model-small-fa-rhasspy-0.15 MyFile.mp3

and got this error:

/your/path/.venv/lib/python3.9/site-packages/librosa/util/decorators.py:88: UserWarning: PySoundFile failed. Trying audioread instead.
  return f(*args, **kwargs)

Thank you so much
opened by GameO7er 1
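The message above is a UserWarning rather than a failure: librosa could not open the MP3 through PySoundFile and fell back to audioread. If that fallback is slow or unavailable, one option is to convert the file to WAV before transcribing. This is a minimal sketch under stated assumptions: `ffmpeg` is installed and on the PATH, the helper names are hypothetical, and the 16 kHz mono target is a guess that the README does not state.

```python
import subprocess


def build_ffmpeg_command(src, dst, sample_rate=16000):
    """Return an ffmpeg command that converts src to a mono WAV at sample_rate."""
    # -y: overwrite dst if it exists, -i: input file,
    # -ac 1: one audio channel (mono), -ar: output sample rate in Hz.
    return ["ffmpeg", "-y", "-i", src, "-ac", "1", "-ar", str(sample_rate), dst]


def convert_to_wav(src, dst, sample_rate=16000):
    """Run the conversion; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_ffmpeg_command(src, dst, sample_rate), check=True)


# Show the command without running it.
print(" ".join(build_ffmpeg_command("MyFile.mp3", "MyFile.wav")))
```

After converting, the WAV file can be passed to transcribe.py in place of the MP3.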
ModuleNotFoundError: No module named 'librosa'
I got this error after following the instructions in the README.md line by line, so I thought reporting it might help others run the script successfully.

Traceback (most recent call last):
  File "/home/gameover/Projects/Python/Rhaspy/transcribe.py", line 8, in <module>
    import librosa
ModuleNotFoundError: No module named 'librosa'

Thank you so much.
opened by GameO7er 1
ModuleNotFoundError: No module named 'numpy'
I got this error after following the instructions in the README.md line by line, so I thought reporting it might help others run the script successfully.

Traceback (most recent call last):
  File "/home/gameover/Projects/Python/Rhaspy/transcribe.py", line 8, in <module>
    import librosa
ModuleNotFoundError: No module named 'numpy'

Thank you so much.
opened by GameO7er 1
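Both reports above come down to modules missing from the virtual environment. A quick check before running transcribe.py could look like the sketch below. The module list is an assumption pieced together from the README (vosk) and the two tracebacks (librosa, numpy); the script may need others.

```python
import importlib.util

# Assumed requirements: vosk from the README, librosa and numpy from the reported errors.
REQUIRED = ("vosk", "librosa", "numpy")


def missing_modules(names=REQUIRED):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
        print("Try: pip3 install " + " ".join(missing))
    else:
        print("All required modules are importable.")
```

Run it inside the activated virtual environment so that it inspects the same site-packages directory that transcribe.py will use.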

Error using recipes

Hello, thanks for your great work and for sharing this useful repo. I tried to use your recipes to train on Persian data. In the run.sh file, an error occurred while adapting lm.arpa and creating G.fst:

creating G.fst...
arpa2fst -
LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:94) Reading \data\ section.
LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \1-grams: section.
LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \2-grams: section.
LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \3-grams: section.
FATAL: FstCompiler: Bad number of columns, source = standard input, line = 28129
ERROR: FstHeader::Read: Bad FST header: standard input

full run.sh output is:

Runtime configuration is: nJobs 12, nDecodeJobs 12. If this is not what you want, edit cmd.sh
Starting at stage 0, train_stage -10

Prepare phoneme data for Kaldi

utils/prepare_lang.sh data/local/dict <unk> data/local/lang data/lang
Checking data/local/dict/silence_phones.txt ...
--> reading data/local/dict/silence_phones.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/silence_phones.txt is OK

Checking data/local/dict/optional_silence.txt ...
--> reading data/local/dict/optional_silence.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/optional_silence.txt is OK

Checking data/local/dict/nonsilence_phones.txt ...
--> reading data/local/dict/nonsilence_phones.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/nonsilence_phones.txt is OK

Checking disjoint: silence_phones.txt, nonsilence_phones.txt
--> disjoint property is OK.

Checking data/local/dict/lexicon.txt
--> reading data/local/dict/lexicon.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/lexicon.txt is OK

Checking data/local/dict/extra_questions.txt ...
--> reading data/local/dict/extra_questions.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/extra_questions.txt is OK
--> SUCCESS [validating dictionary directory data/local/dict]

**Creating data/local/dict/lexiconp.txt from data/local/dict/lexicon.txt
fstaddselfloops data/lang/phones/wdisambig_phones.int data/lang/phones/wdisambig_words.int
prepare_lang.sh: validating output directory
utils/validate_lang.pl data/lang
Checking existence of separator file
separator file data/lang/subword_separator.txt is empty or does not exist, deal in word case.
Checking data/lang/phones.txt ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/lang/phones.txt is OK

Checking words.txt: #0 ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/lang/words.txt is OK

Checking disjoint: silence.txt, nonsilence.txt, disambig.txt ...
--> silence.txt and nonsilence.txt are disjoint
--> silence.txt and disambig.txt are disjoint
--> disambig.txt and nonsilence.txt are disjoint
--> disjoint property is OK

Checking sumation: silence.txt, nonsilence.txt, disambig.txt ...
--> found no unexplainable phones in phones.txt

Checking data/lang/phones/context_indep.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 15 entry/entries in data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.int corresponds to data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.csl corresponds to data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.{txt, int, csl} are OK

Checking data/lang/phones/nonsilence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 116 entry/entries in data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.int corresponds to data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.csl corresponds to data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.{txt, int, csl} are OK

Checking data/lang/phones/silence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 15 entry/entries in data/lang/phones/silence.txt
--> data/lang/phones/silence.int corresponds to data/lang/phones/silence.txt
--> data/lang/phones/silence.csl corresponds to data/lang/phones/silence.txt
--> data/lang/phones/silence.{txt, int, csl} are OK

Checking data/lang/phones/optional_silence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.int corresponds to data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.csl corresponds to data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.{txt, int, csl} are OK

Checking data/lang/phones/disambig.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 14 entry/entries in data/lang/phones/disambig.txt
--> data/lang/phones/disambig.int corresponds to data/lang/phones/disambig.txt
--> data/lang/phones/disambig.csl corresponds to data/lang/phones/disambig.txt
--> data/lang/phones/disambig.{txt, int, csl} are OK

Checking data/lang/phones/roots.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 32 entry/entries in data/lang/phones/roots.txt
--> data/lang/phones/roots.int corresponds to data/lang/phones/roots.txt
--> data/lang/phones/roots.{txt, int} are OK

Checking data/lang/phones/sets.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 32 entry/entries in data/lang/phones/sets.txt
--> data/lang/phones/sets.int corresponds to data/lang/phones/sets.txt
--> data/lang/phones/sets.{txt, int} are OK

Checking data/lang/phones/extra_questions.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 11 entry/entries in data/lang/phones/extra_questions.txt
--> data/lang/phones/extra_questions.int corresponds to data/lang/phones/extra_questions.txt
--> data/lang/phones/extra_questions.{txt, int} are OK

Checking data/lang/phones/word_boundary.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 131 entry/entries in data/lang/phones/word_boundary.txt
--> data/lang/phones/word_boundary.int corresponds to data/lang/phones/word_boundary.txt
--> data/lang/phones/word_boundary.{txt, int} are OK

Checking optional_silence.txt ...
--> reading data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.txt is OK

Checking disambiguation symbols: #0 and #1
--> data/lang/phones/disambig.txt has "#0" and "#1"
--> data/lang/phones/disambig.txt is OK

Checking topo ...

Checking word_boundary.txt: silence.txt, nonsilence.txt, disambig.txt ...
--> data/lang/phones/word_boundary.txt doesn't include disambiguation symbols
--> data/lang/phones/word_boundary.txt is the union of nonsilence.txt and silence.txt
--> data/lang/phones/word_boundary.txt is OK

Checking word-level disambiguation symbols...
--> data/lang/phones/wdisambig.txt exists (newer prepare_lang.sh)
Checking word_boundary.int and disambig.int
--> generating a 35 word/subword sequence
--> resulting phone sequence from L.fst corresponds to the word sequence
--> L.fst is OK
--> generating a 45 word/subword sequence
--> resulting phone sequence from L_disambig.fst corresponds to the word sequence
--> L_disambig.fst is OK

Checking data/lang/oov.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang/oov.txt
--> data/lang/oov.int corresponds to data/lang/oov.txt
--> data/lang/oov.{txt, int} are OK

--> data/lang/L.fst is olabel sorted
--> data/lang/L_disambig.fst is olabel sorted
--> SUCCESS [validating lang directory data/lang]

adapt our LM for kaldi...


creating G.fst...
arpa2fst -
LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:94) Reading \data\ section.
LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \1-grams: section.
LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \2-grams: section.
LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \3-grams: section.
FATAL: FstCompiler: Bad number of columns, source = standard input, line = 28129
ERROR: FstHeader::Read: Bad FST header: standard input

make mfcc

fix_data_dir.sh: kept all 12394 utterances.
fix_data_dir.sh: old files are kept in data/train/.backup
mkdir: cannot create directory 'data/train/wav.scp': File exists
steps/make_mfcc.sh --cmd utils/run.pl --nj 12 data/train exp/make_mfcc_chain/train mfcc_chain
utils/validate_data_dir.sh: Successfully validated data-directory data/train
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.

Can you please help me fix this issue? Thanks.

opened by MahdiEsrafili 0
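One way to narrow down a "Bad number of columns" failure is to look for language model entries whose field count is unexpected. In the ARPA format, an entry in the `\n-grams:` section consists of a log probability, n words, and an optional backoff weight, so a line with any other number of space- or tab-separated fields is suspect. The sketch below is a heuristic only and does not claim to identify the actual cause of this error: the function name is hypothetical, and the line number in the log (28129) likely refers to the stream fed to the FST compiler, which may not match a line number in lm.arpa.

```python
import re

# Matches section headers such as "\1-grams:" or "\3-grams:".
SECTION = re.compile(r"^\\(\d+)-grams:$")


def find_suspect_ngrams(lines):
    """Yield (line_number, text) for n-gram entries with an unexpected field count."""
    order = None  # n of the section currently being read, or None outside any section
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        stripped = line.strip(" \t\r")
        match = SECTION.match(stripped)
        if match:
            order = int(match.group(1))
            continue
        if stripped.startswith("\\"):
            # Other backslash lines (\data\, \end\) close the current section.
            order = None
            continue
        if order is None or not stripped:
            continue
        fields = re.split(r"[ \t]+", stripped)
        # Expected: log probability + n words, optionally followed by a backoff weight.
        if len(fields) not in (order + 1, order + 2):
            yield number, line


# Tiny in-memory example; the unigram "bar baz" contains a space and is flagged.
example = [
    "\\data\\", "ngram 1=2", "ngram 2=1", "",
    "\\1-grams:", "-1.0\tfoo\t-0.5", "-1.0\tbar baz\t-0.5", "",
    "\\2-grams:", "-0.3\tfoo bar", "",
    "\\end\\",
]
for number, text in find_suspect_ngrams(example):
    print(number, repr(text))
```

For a real model, the same function can be applied to an open file, for example `find_suspect_ngrams(open("lm.arpa", encoding="utf-8"))`, where the path is whatever run.sh reads.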


Releases(v1.0)

v1.0(Oct 9, 2021)

Owner

Rhasspy
