Persian Kaldi profile for Rhasspy built from open speech data

Overview

Persian Kaldi Profile

A Rhasspy profile for Persian (fa).

Installation

Get started by first installing Vosk:

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate
pip3 install --upgrade pip
pip3 install --upgrade wheel setuptools

# Install Vosk
pip3 install vosk

Next, download the model and extract it:

wget 'https://github.com/rhasspy/fa_kaldi-rhasspy/releases/download/v1.0/vosk-model-small-fa-rhasspy-0.15.zip'
unzip vosk-model-small-fa-rhasspy-0.15.zip

Finally, run the transcribe.py Python program with the model and an audio file:

python3 transcribe.py vosk-model-small-fa-rhasspy-0.15 welcome.wav

{"result": [{"conf": 1.0, "end": 0.48, "start": 0.06, "word": "خوش"}, {"conf": 1.0, "end": 1.11, "start": 0.48, "word": "آمدید"}], "text": "خوش آمدید"}

For each audio file given to transcribe.py, a line of JSON will be printed in the output with the transcription details.

You might also like...
Service for working with open data of the State Duma of the Russian Federation
Service for working with open data of the State Duma of the Russian Federation

Сервис для работы с открытыми данными Госдумы РФ Исходные данные из API Госдумы РФ извлекаются с помощью Apache Nifi и приземляются в хранилище Clickh

Driving lessons made simpler. Custom scheduling API built with Python.
Driving lessons made simpler. Custom scheduling API built with Python.

NOTE This is a mirror of a GitLab repository. Dryvo Dryvo is a unique solution for the driving lessons industry. Our aim is to save the teacher’s time

Ikaros is a free financial library built in pure python that can be used to get information for single stocks, generate signals and build prortfolios

Ikaros is a free financial library built in pure python that can be used to get information for single stocks, generate signals and build prortfolios

This repository contains Python Projects for Beginners as well as for Intermediate Developers built by Contributors.
This repository contains Python Projects for Beginners as well as for Intermediate Developers built by Contributors.

Python Projects {Open Source} Introduction The repository was built with a tree-like structure in mind, it contains collections of Python Projects. Mo

Here, I have discuss the three methods of list reversion. The three methods are built-in method, slicing method and position changing method.

Three-different-method-for-list-reversion Here, I have discuss the three methods of list reversion. The three methods are built-in method, slicing met

Dot Browser is a privacy-conscious web browser with smarts built-in for protection against trackers and advertisments online.
Dot Browser is a privacy-conscious web browser with smarts built-in for protection against trackers and advertisments online.

🌍 Take back your privacy with Dot Browser, the privacy-conscious web browser that protects you from being tracked and monitored online.

Built with Python programming language and QT library and Guess the number in three easy, medium and hard rolls
Built with Python programming language and QT library and Guess the number in three easy, medium and hard rolls

guess-the-numbers Built with Python programming language and QT library and Guess the number in three easy, medium and hard rolls Number guessing game

Built with Python programming language and QT library and Guess the number in three easy, medium and hard rolls
Built with Python programming language and QT library and Guess the number in three easy, medium and hard rolls

password-generator Built with Python programming language and QT library and Guess the number in three easy, medium and hard rolls Password generator

Comments
  •  PySoundFile failed. Trying audioread instead.

    PySoundFile failed. Trying audioread instead.

    I just tried to run this command: python3 transcribe.py vosk-model-small-fa-rhasspy-0.15 MyFile.mp3

    and got this error:

    /your/path/.venv/lib/python3.9/site-packages/librosa/util/decorators.py:88: UserWarning: PySoundFile failed. Trying audioread instead.
      return f(*args, **kwargs)  
    

    Thank you so much

    opened by GameO7er 1
  • ModuleNotFoundError: No module named 'librosa'

    ModuleNotFoundError: No module named 'librosa'

    I got this error when I just did follow your instruction in the Readme.md line by line. So I thought maybe this help others for running the script successfully.

    Traceback (most recent call last):
      File "/home/gameover/Projects/Python/Rhaspy/transcribe.py", line 8, in <module>
        import librosa
    ModuleNotFoundError: No module named 'librosa'
    

    Thank you so much.

    opened by GameO7er 1
  • ModuleNotFoundError: No module named 'numpy'

    ModuleNotFoundError: No module named 'numpy'

    I got this error when I just did follow your instruction in the Readme.md line by line. So I thought maybe this help others for running the script successfully.

    Traceback (most recent call last):
      File "/home/gameover/Projects/Python/Rhaspy/transcribe.py", line 8, in <module>
        import librosa
    ModuleNotFoundError: No module named 'numpy'
    

    Thank you so much.

    opened by GameO7er 1
  • Error using recipes

    Error using recipes

    Hello, Thanks for you great work for sharing this useful repo. I tried to use your recipes to train Persian data. In run.sh file, an error ocurred while adapting lm.arpa and creating G.fst:

    creating G.fst...
    arpa2fst -
    LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:94) Reading \data\ section.
    LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \1-grams: section.
    LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \2-grams: section.
    LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \3-grams: section.
    FATAL: FstCompiler: Bad number of columns, source = standard input, line = 28129
    ERROR: FstHeader::Read: Bad FST header: standard input
    

    full run.sh output is:

    Runtime configuration is: nJobs 12, nDecodeJobs 12. If this is not what you want, edit cmd.sh
    Starting at stage 0, train_stage -10
    
    Prepare phoneme data for Kaldi
    
    utils/prepare_lang.sh data/local/dict <unk> data/local/lang data/lang
    Checking data/local/dict/silence_phones.txt ...
    --> reading data/local/dict/silence_phones.txt
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> data/local/dict/silence_phones.txt is OK
    
    Checking data/local/dict/optional_silence.txt ...
    --> reading data/local/dict/optional_silence.txt
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> data/local/dict/optional_silence.txt is OK
    
    Checking data/local/dict/nonsilence_phones.txt ...
    --> reading data/local/dict/nonsilence_phones.txt
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> data/local/dict/nonsilence_phones.txt is OK
    
    Checking disjoint: silence_phones.txt, nonsilence_phones.txt
    --> disjoint property is OK.
    
    Checking data/local/dict/lexicon.txt
    --> reading data/local/dict/lexicon.txt
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> data/local/dict/lexicon.txt is OK
    
    Checking data/local/dict/extra_questions.txt ...
    --> reading data/local/dict/extra_questions.txt
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> data/local/dict/extra_questions.txt is OK
    --> SUCCESS [validating dictionary directory data/local/dict]
    
    **Creating data/local/dict/lexiconp.txt from data/local/dict/lexicon.txt
    fstaddselfloops data/lang/phones/wdisambig_phones.int data/lang/phones/wdisambig_words.int
    prepare_lang.sh: validating output directory
    utils/validate_lang.pl data/lang
    Checking existence of separator file
    separator file data/lang/subword_separator.txt is empty or does not exist, deal in word case.
    Checking data/lang/phones.txt ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> data/lang/phones.txt is OK
    
    Checking words.txt: #0 ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> data/lang/words.txt is OK
    
    Checking disjoint: silence.txt, nonsilence.txt, disambig.txt ...
    --> silence.txt and nonsilence.txt are disjoint
    --> silence.txt and disambig.txt are disjoint
    --> disambig.txt and nonsilence.txt are disjoint
    --> disjoint property is OK
    
    Checking sumation: silence.txt, nonsilence.txt, disambig.txt ...
    --> found no unexplainable phones in phones.txt
    
    Checking data/lang/phones/context_indep.{txt, int, csl} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 15 entry/entries in data/lang/phones/context_indep.txt
    --> data/lang/phones/context_indep.int corresponds to data/lang/phones/context_indep.txt
    --> data/lang/phones/context_indep.csl corresponds to data/lang/phones/context_indep.txt
    --> data/lang/phones/context_indep.{txt, int, csl} are OK
    
    Checking data/lang/phones/nonsilence.{txt, int, csl} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 116 entry/entries in data/lang/phones/nonsilence.txt
    --> data/lang/phones/nonsilence.int corresponds to data/lang/phones/nonsilence.txt
    --> data/lang/phones/nonsilence.csl corresponds to data/lang/phones/nonsilence.txt
    --> data/lang/phones/nonsilence.{txt, int, csl} are OK
    
    Checking data/lang/phones/silence.{txt, int, csl} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 15 entry/entries in data/lang/phones/silence.txt
    --> data/lang/phones/silence.int corresponds to data/lang/phones/silence.txt
    --> data/lang/phones/silence.csl corresponds to data/lang/phones/silence.txt
    --> data/lang/phones/silence.{txt, int, csl} are OK
    
    Checking data/lang/phones/optional_silence.{txt, int, csl} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 1 entry/entries in data/lang/phones/optional_silence.txt
    --> data/lang/phones/optional_silence.int corresponds to data/lang/phones/optional_silence.txt
    --> data/lang/phones/optional_silence.csl corresponds to data/lang/phones/optional_silence.txt
    --> data/lang/phones/optional_silence.{txt, int, csl} are OK
    
    Checking data/lang/phones/disambig.{txt, int, csl} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 14 entry/entries in data/lang/phones/disambig.txt
    --> data/lang/phones/disambig.int corresponds to data/lang/phones/disambig.txt
    --> data/lang/phones/disambig.csl corresponds to data/lang/phones/disambig.txt
    --> data/lang/phones/disambig.{txt, int, csl} are OK
    
    Checking data/lang/phones/roots.{txt, int} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 32 entry/entries in data/lang/phones/roots.txt
    --> data/lang/phones/roots.int corresponds to data/lang/phones/roots.txt
    --> data/lang/phones/roots.{txt, int} are OK
    
    Checking data/lang/phones/sets.{txt, int} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 32 entry/entries in data/lang/phones/sets.txt
    --> data/lang/phones/sets.int corresponds to data/lang/phones/sets.txt
    --> data/lang/phones/sets.{txt, int} are OK
    
    Checking data/lang/phones/extra_questions.{txt, int} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 11 entry/entries in data/lang/phones/extra_questions.txt
    --> data/lang/phones/extra_questions.int corresponds to data/lang/phones/extra_questions.txt
    --> data/lang/phones/extra_questions.{txt, int} are OK
    
    Checking data/lang/phones/word_boundary.{txt, int} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 131 entry/entries in data/lang/phones/word_boundary.txt
    --> data/lang/phones/word_boundary.int corresponds to data/lang/phones/word_boundary.txt
    --> data/lang/phones/word_boundary.{txt, int} are OK
    
    Checking optional_silence.txt ...
    --> reading data/lang/phones/optional_silence.txt
    --> data/lang/phones/optional_silence.txt is OK
    
    Checking disambiguation symbols: #0 and #1
    --> data/lang/phones/disambig.txt has "#0" and "#1"
    --> data/lang/phones/disambig.txt is OK
    
    Checking topo ...
    
    Checking word_boundary.txt: silence.txt, nonsilence.txt, disambig.txt ...
    --> data/lang/phones/word_boundary.txt doesn't include disambiguation symbols
    --> data/lang/phones/word_boundary.txt is the union of nonsilence.txt and silence.txt
    --> data/lang/phones/word_boundary.txt is OK
    
    Checking word-level disambiguation symbols...
    --> data/lang/phones/wdisambig.txt exists (newer prepare_lang.sh)
    Checking word_boundary.int and disambig.int
    --> generating a 35 word/subword sequence
    --> resulting phone sequence from L.fst corresponds to the word sequence
    --> L.fst is OK
    --> generating a 45 word/subword sequence
    --> resulting phone sequence from L_disambig.fst corresponds to the word sequence
    --> L_disambig.fst is OK
    
    Checking data/lang/oov.{txt, int} ...
    --> text seems to be UTF-8 or ASCII, checking whitespaces
    --> text contains only allowed whitespaces
    --> 1 entry/entries in data/lang/oov.txt
    --> data/lang/oov.int corresponds to data/lang/oov.txt
    --> data/lang/oov.{txt, int} are OK
    
    --> data/lang/L.fst is olabel sorted
    --> data/lang/L_disambig.fst is olabel sorted
    --> SUCCESS [validating lang directory data/lang]
    
    adapt our LM for kaldi...
    
    
    creating G.fst...
    arpa2fst -
    LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:94) Reading \data\ section.
    LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \1-grams: section.
    LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \2-grams: section.
    LOG (arpa2fst[5.5.0~1-2b62]:Read():arpa-file-parser.cc:149) Reading \3-grams: section.
    FATAL: FstCompiler: Bad number of columns, source = standard input, line = 28129
    ERROR: FstHeader::Read: Bad FST header: standard input
    
    make mfcc
    
    fix_data_dir.sh: kept all 12394 utterances.
    fix_data_dir.sh: old files are kept in data/train/.backup
    mkdir: cannot create directory 'data/train/wav.scp': File exists
    steps/make_mfcc.sh --cmd utils/run.pl --nj 12 data/train exp/make_mfcc_chain/train mfcc_chain
    utils/validate_data_dir.sh: Successfully validated data-directory data/train
    steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
    

    can you please help me fix this issue? thanks

    opened by MahdiEsrafili 0
Owner
Rhasspy
Offline voice assistant
Rhasspy
Resizing using nnedi3/znedi3/nnedi3cl with center alignment and correct chroma placement

nnedi3_resample A VapourSynth script for easy resizing using nnedi3/znedi3/nnedi3cl with center alignment and correct chroma placement. Requirements n

Home Of VapourSynth Evolution 12 Sep 08, 2022
An kind of operating system portal to a variety of apps with pure python

pyos An kind of operating system portal to a variety of apps. Installation Run this on your terminal: git clone https://github.com/arjunj132/pyos.git

1 Jan 22, 2022
Pytorch implementation of "Peer Loss Functions: Learning from Noisy Labels without Knowing Noise Rates"

Peer Loss functions This repository is the (Multi-Class & Deep Learning) Pytorch implementation of "Peer Loss Functions: Learning from Noisy Labels wi

Kushal Shingote 1 Feb 08, 2022
x-tools is a collection of tools developed in Python

x-tools X-tools is a collection of tools developed in Python Commands\

5 Jan 24, 2022
SWS Filters App - SWS Filters App With Python

SWS Filters App Fun 😅 ... Fun 😅 Click On photo and see 😂 😂 😂 Your Video rec

Sagar Jangid 3 Jul 07, 2022
Two predictive attributes (Speed and Angle) and one attribute target (Power)

Two predictive attributes (Speed and Angle) and one attribute target (Power). A container crane has the function of transporting containers from one point to another point. The difficulty of this tas

Astitva Veer Garg 1 Jan 11, 2022
Completed task 1 and task 2 at LetsGrowMore as a data science intern.

LetsGrowMore-Internship Completed task 1 and task 2 at LetsGrowMore as a data science intern. Task 1- Task 2- Creating a Decision Tree classifier and

Sanjyot Panure 1 Jan 16, 2022
A python tool for synchronizing the messages from different threads, processes, or hosts.

Sync-stream This project is designed for providing the synchoronization of the stdout / stderr among different threads, processes, devices or hosts.

Yuchen Jin 0 Aug 11, 2021
Sudoku solver using backtracking

Sudoku solver Sudoku solver using backtracking Basically in sudoku, we want to be able to solve a sudoku puzzle given an input like this, which repres

Kylie 99 Jan 07, 2023
Oregon State University grade distributions from Fall 2018 through Summer 2021

Oregon State University Grades Oregon State University grade distributions from Fall 2018 through Summer 2021 obtained through a Freedom Of Informatio

Melanie Gutzmann 5 May 02, 2022
Ghost source since the developer of the project quit due to reasons

👻 Ghost Selfbot The official code for Ghost which was recently discontinued and released to the public. Feel free to use any of the code found in thi

xannyy 2 Mar 24, 2022
Blender Add-on to Add Metal Materials to Your Scene

Blender QMM (Quick Metal Materials) Blender Addon to Add Metal Materials to Your Scene Installation Download the latest ZIP from Releases. Usage This

Don Schnitzius 27 Dec 26, 2022
Set of tools to analyze Tinynuke samples

tinynuke-toolset You'll find in that repository a set of tools and scripts I developped to analyze Tinynuke samples. Dll extractor: script used to ext

Heat Miser 14 Aug 18, 2022
Sheet2export - FreeCAD macro to export spreadsheet

Description This is FreeCAD macro to export spreadsheet to file.

Darek L 3 Jul 09, 2022
Regular Expressions - Use regular expressions to detect date format

A list of all the resources used https://regex101.com/ - To test regex https://w

Ravika Nagpal 1 Jan 04, 2022
An OrpheusDL Tidal module

OrpheusDL - Tidal A Tidal module for the OrpheusDL modular archival music program Report Bug · Request Feature Table of content About OrpheusDL - Tida

Daniel 54 Dec 29, 2022
An app to help people apply for admissions on schools/hostels

Admission-helper About An app to help people apply for admissions on schools/hostels This app is a rewrite of Admission-helper-beta-v5.8.9 and I impor

Advik 3 Apr 24, 2022
Skull shaped MOSFET cells for the Efabless's 130nm process

SkullFET Skull shaped MOSFET cells for the Efabless's 130nm process List of cells Inverter Copyright (C) 2021 Uri Shaked

Wokwi 3 Dec 14, 2022
A collection of python exercises to help your learning path!

How to use Step 1: run this command git clone https://github.com/TechPenguineer/Python-Exercises.git Step 2: Run this command cd Python-Exercises You

Tech Penguin 5 Aug 05, 2021
Moleey Panel with python 3

Painel-Moleey pkg upgrade && pkg update pkg install python3 pip install pyfiglet pip install colored pip install requests pip install phonenumbers pkg

Moleey. 1 Oct 17, 2021