This repository contains the data used in the NAACL 2021 paper "Proteno: Text Normalization with Limited Data for Fast Deployment in Text to Speech Systems".

Last update: Dec 04, 2022

Overview

Proteno

This is the data release for the NAACL 2021 paper "Proteno: Text Normalization with Limited Data for Fast Deployment in Text to Speech Systems" (https://arxiv.org/abs/2104.07777).

Security

See CONTRIBUTING for more information.

License

This project is released under CC-BY-NC-4.0 and other licenses:

English: CC-BY-SA
Spanish: CC-BY-SA
Tamil: CC-BY-NC-SA

Citation

If you use our data, please cite the following paper:

@inproceedings{tyagi-etal-2021-proteno,
    title = "Proteno: Text Normalization with Limited Data for Fast Deployment in Text to Speech Systems",
    author = "Tyagi, Shubhi  and
      Bonafonte, Antonio  and
      Lorenzo-Trueba, Jaime  and
      Latorre, Javier",
    booktitle = "Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers",
    month = jun,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2021.naacl-industry.10",
    pages = "72--79",
}
