Grapheme-to-phoneme (G2P) conversion is the process of generating pronunciation for words based on their written form.

Last update: Nov 16, 2022

Overview

Neural G2P for the Portuguese language

Grapheme-to-phoneme (G2P) conversion is the process of generating pronunciation for words based on their written form. It plays an essential role in natural language processing, text-to-speech synthesis, and automatic speech recognition systems. This project was adapted from https://github.com/hajix/G2P.

Dependencies

The following libraries are used:
pytorch
tqdm
matplotlib

Install dependencies using pip:

pip3 install -r requirements.txt

Dataset

The dataset used here was taken from http://www.portaldalinguaportuguesa.org/, with some additions of my own to give better coverage of common everyday words in Brazilian Portuguese. Some ambiguities were also resolved, since the dataset is intended to carry a specific speaker bias: the dictionary based on São Paulo speakers was chosen.

As in https://github.com/hajix/G2P, on which this implementation is based, you can easily provide and use your own language-specific pronunciation dictionary for training G2P. More details about data preparation and contribution can be found in resources.
Feel free to provide resources for other languages.
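
As a rough illustration of what a custom pronunciation dictionary could look like in code, the sketch below parses lines of the form "word<TAB>space-separated phonemes" into a Python dict. The file layout, the function name load_pronunciation_dict and the comment syntax are assumptions made for this example; check resources for the format this project actually expects.

```python
import io


def load_pronunciation_dict(lines):
    """Parse 'word<TAB>phoneme phoneme ...' lines into {word: [phonemes]}.

    Hypothetical format: the real layout used by this project may differ.
    """
    entries = {}
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and (assumed) '#' comment lines.
        if not line or line.startswith("#"):
            continue
        word, _, phonemes = line.partition("\t")
        if not phonemes:
            continue  # malformed line without a pronunciation
        entries[word] = phonemes.split()
    return entries


# Two entries whose phonemes are taken from the inference example in this README.
sample = io.StringIO("olá\to l a\nprojeto\tp ɾ o ʒ e t ʊ\n")
lexicon = load_pronunciation_dict(sample)
print(lexicon["olá"])
```

The same function accepts an open file object, since it only iterates over lines.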

Attention Model

Both a plain encoder-decoder seq2seq model and an attention model can handle the G2P problem. Here we train an attention-based model. The encoder takes a sequence of graphemes and produces a state at each timestep. These encoder states are used during attention decoding: the decoder attends to the appropriate encoder states (according to its own state) and produces phonemes.
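
To make the attention step concrete, here is a minimal pure-Python sketch of a single decoding step. It assumes dot-product scoring between the decoder state and each encoder state; the scoring function actually used in this project is not stated here and may differ, and the names attend and softmax are illustrative only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """One attention step: return (weights, context).

    Assumption: dot-product scoring; other scoring functions are possible.
    """
    # Score each encoder state against the current decoder state.
    scores = [sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states]
    weights = softmax(scores)
    # The context vector is the weighted sum of the encoder states.
    dim = len(encoder_states[0])
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states)) for i in range(dim)]
    return weights, context


# Three toy encoder states (one per grapheme) and one decoder state.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.0, 2.0]
weights, context = attend(decoder_state, encoder_states)
print(weights, context)
```

In a trained model the context vector would be combined with the decoder state to predict the next phoneme; the weights are what an attention visualization plots.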

Train

To start training the model, run:

python train.py

You can also use TensorBoard to check the training loss:

tensorboard --logdir log --bind_all

Training parameters can be found in config.py.

Inference

To get the pronunciation of a word or sentence:

# PT-BR example
python inference.py --sentence 'olá, vamos testar esse projeto.'
o|l|a| |,| |v|a|m|ʊ|s| |t|e|s|t|a| |e|s|i| |p|ɾ|o|ʒ|e|t|ʊ| |.

You can also visualize the attention weights using --visualize:

# PT-BR example
python inference.py --visualize --sentence 'olá, vamos testar esse projeto.'
o|l|a| |,| |v|a|m|ʊ|s| |t|e|s|t|a| |e|s|i| |p|ɾ|o|ʒ|e|t|ʊ| |.
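
If you want to post-process this output, the sketch below groups the phonemes by word. It assumes only what the example output shows: symbols separated by "|" with a single space token between words and punctuation. The helper name split_phoneme_output is hypothetical and not part of this project.

```python
def split_phoneme_output(line, sep="|", word_gap=" "):
    """Group a 'p|h|o| |n|e' style string into a list of phoneme lists."""
    words, current = [], []
    for token in line.split(sep):
        if token == word_gap:
            # A lone space token marks a boundary between words.
            if current:
                words.append(current)
                current = []
        else:
            current.append(token)
    if current:
        words.append(current)
    return words


output = "o|l|a| |,| |v|a|m|ʊ|s| |t|e|s|t|a| |e|s|i| |p|ɾ|o|ʒ|e|t|ʊ| |."
words = split_phoneme_output(output)
for phonemes in words:
    print("".join(phonemes))
```

Punctuation marks come back as their own single-element groups, so they can be filtered out or kept as needed.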

Owner

fluz