Azure Text-to-speech service for Home Assistant

Overview

hacs_badge

Azure Text-to-speech service for Home Assistant

The Azure text-to-speech platform uses online Azure Text-to-Speech cognitive service to read a text with natural sounding voice.

The main reason behind this custom integration is to decouple the Microsoft TTS service from the python library pycsspeechtts used by the "official" integration.

This integration uses the native Azure Cognitive Speech Service Text-to-speech REST API (I know.. it is too long for a service name).

Features

  • Supports multi language. You can find the full list of languages here.
  • Supports SSML.

Basic Configuration

# Text to speech
tts:
  - platform: azure_tts
    service_name: azure_say
    api_key: <your_api_key>

Configuration variables

This integration accepts the same configuration variables as the out-of-the-box Microsoft TTS].

You might also like...
This repository contains data used in the NAACL 2021 Paper - Proteno: Text Normalization with Limited Data for Fast Deployment in Text to Speech Systems

Proteno This is the data release associated with the corresponding NAACL 2021 Paper - Proteno: Text Normalization with Limited Data for Fast Deploymen

Unofficial Implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration
Unofficial Implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration

Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration This repo contains only model Implementation of Zero-Shot Text-to-Speech for Text

glow-speak is a fast, local, neural text to speech system that uses eSpeak-ng as a text/phoneme front-end.

Glow-Speak glow-speak is a fast, local, neural text to speech system that uses eSpeak-ng as a text/phoneme front-end. Installation git clone https://g

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

⚠️ Checkout develop branch to see what is coming in pyannote.audio 2.0: a much smaller and cleaner codebase Python-first API (the good old pyannote-au

In this repository, I have developed an end to end Automatic speech recognition project. I have developed the neural network model for automatic speech recognition with PyTorch and used MLflow to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry.
Speech Recognition for Uyghur using Speech transformer

Speech Recognition for Uyghur using Speech transformer Training: this model using CTC loss and Cross Entropy loss for training. Download pretrained mo

Text-Summarization-using-NLP - Text Summarization using NLP  to fetch BBC News Article and summarize its text and also it includes custom article Summarization
Bidirectional Variational Inference for Non-Autoregressive Text-to-Speech (BVAE-TTS)

Bidirectional Variational Inference for Non-Autoregressive Text-to-Speech (BVAE-TTS) Yoonhyung Lee, Joongbo Shin, Kyomin Jung Abstract: Although early

Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.

TextBlob: Simplified Text Processing Homepage: https://textblob.readthedocs.io/ TextBlob is a Python (2 and 3) library for processing textual data. It

Comments
  • init and concatenate str error

    init and concatenate str error

    Hi, i got two errors with your integration: my configuration.yaml is:

        #https://github.com/yassineselmi/homeassistant-azure-tts
      - platform: azure_tts
        service_name: tts_microsoft_noemi_notok
        cache: false
        api_key: ####################
        language: hu-HU
        gender: Female
        #type: hu-HU-NoemiNeural
        type: NoemiNeural
        rate: 100
        volume: 100
        pitch: default
        contour: (0, 0) (100, 100)
        region: westeurope
    

    my automation is:

    alias: Announcement, Time (Microsoft)
    description: ''
    trigger:
      - platform: time_pattern
        minutes: /15
    condition: []
    action:
      - service: tts.tts_microsoft_noemi_notok
        data:
          entity_id: media_player.living_room_speaker, media_player.bedroom_speaker
          message: {{ now().hour}} óra {{ "%0.02d" | format(now().strftime("%-M") | int) }} perc
    mode: single
    

    Error1

    Error on init TTS: No TTS from azure_tts for 'message: 20 óra 30 perc'
    8:30:51 PM – (ERROR) Text-to-Speech (TTS)
    
    Logger: homeassistant.components.tts
    Source: components/tts/__init__.py:188
    Integration: Text-to-Speech (TTS) (documentation, issues)
    First occurred: 8:30:51 PM (1 occurrences)
    Last logged: 8:30:51 PM
    
    Error on init TTS: No TTS from azure_tts for 'message: 20 óra 30 perc'
    

    Error2

    Error occurred for Azure TTS: can only concatenate str (not "bytes") to str
    8:30:51 PM – (ERROR) azure_tts (custom integration)
    
    Logger: custom_components.azure_tts.tts
    Source: custom_components/azure_tts/tts.py:415
    Integration: azure_tts (documentation, issues)
    First occurred: 8:30:51 PM (1 occurrences)
    Last logged: 8:30:51 PM
    
    Error occurred for Azure TTS: can only concatenate str (not "bytes") to str
    

    do you have a solution for this issue?

    also id like to change the ptch of the voice a bit deeper, and at sample site (microsoft) and in azur, its posible to change this attribute. id like to use 0.9 for pitch and 1.2 for speed

    Thanks, Zoltan

    ps: with his integration it works: https://github.com/georgezhao2010/azure_cognitive_speech

      - platform: azure_cognitive_speech
        service_name: tts_microsoft_noemi
        cache: false
        api_key: #############
        region: westeurope
        default_voice: Noemi
    
    opened by vzoltan 2
Releases(0.1.2)
Owner
Yassine Selmi
DevOps, Architect. Python guru
Yassine Selmi
Machine learning models from Singapore's NLP research community

SG-NLP Machine learning models from Singapore's natural language processing (NLP) research community. sgnlp is a Python package that allows you to eas

AI Singapore | AI Makerspace 21 Dec 17, 2022
Chinese NER with albert/electra or other bert descendable model (keras)

Chinese NLP (albert/electra with Keras) Named Entity Recognization Project Structure ./ ├── NER │   ├── __init__.py │   ├── log

2 Nov 20, 2022
A multi-lingual approach to AllenNLP CoReference Resolution along with a wrapper for spaCy.

Crosslingual Coreference Coreference is amazing but the data required for training a model is very scarce. In our case, the available training for non

Pandora Intelligence 71 Jan 04, 2023
Python interface for converting Penn Treebank trees to Stanford Dependencies and Universal Depenencies

PyStanfordDependencies Python interface for converting Penn Treebank trees to Universal Dependencies and Stanford Dependencies. Example usage Start by

David McClosky 64 May 08, 2022
Programme de chiffrement et de déchiffrement inverse d'un message en python3.

Chiffrement Inverse En Python3 Programme de chiffrement et de déchiffrement inverse d'un message en python3. Explication du chiffrement inverse avec c

Malik Makkes 2 Mar 26, 2022
Uses Google's gTTS module to easily create robo text readin' on command.

Tool to convert text to speech, creating files for later use. TTRS uses Google's gTTS module to easily create robo text readin' on command.

0 Jun 20, 2021
The code from the whylogs workshop in DataTalks.Club on 29 March 2022

whylogs Workshop The code from the whylogs workshop in DataTalks.Club on 29 March 2022 whylogs - The open source standard for data logging (Don't forg

DataTalksClub 12 Sep 05, 2022
Intent parsing and slot filling in PyTorch with seq2seq + attention

PyTorch Seq2Seq Intent Parsing Reframing intent parsing as a human - machine translation task. Work in progress successor to torch-seq2seq-intent-pars

Sean Robertson 159 Apr 04, 2022
SpikeX - SpaCy Pipes for Knowledge Extraction

SpikeX is a collection of pipes ready to be plugged in a spaCy pipeline. It aims to help in building knowledge extraction tools with almost-zero effort.

Erre Quadro Srl 384 Dec 12, 2022
A very simple framework for state-of-the-art Natural Language Processing (NLP)

A very simple framework for state-of-the-art NLP. Developed by Humboldt University of Berlin and friends. IMPORTANT: (30.08.2020) We moved our models

flair 12.3k Dec 31, 2022
This project aims to conduct a text information retrieval and text mining on medical research publication regarding Covid19 - treatments and vaccinations.

Project: Text Analysis - This project aims to conduct a text information retrieval and text mining on medical research publication regarding Covid19 -

1 Mar 14, 2022
Knowledge Graph,Question Answering System,基于知识图谱和向量检索的医疗诊断问答系统

Knowledge Graph,Question Answering System,基于知识图谱和向量检索的医疗诊断问答系统

wangle 823 Dec 28, 2022
RuCLIP tiny (Russian Contrastive Language–Image Pretraining) is a neural network trained to work with different pairs (images, texts).

RuCLIPtiny Zero-shot image classification model for Russian language RuCLIP tiny (Russian Contrastive Language–Image Pretraining) is a neural network

Shahmatov Arseniy 26 Sep 20, 2022
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

ALBERT ***************New March 28, 2020 *************** Add a colab tutorial to run fine-tuning for GLUE datasets. ***************New January 7, 2020

Google Research 3k Dec 26, 2022
Multi Task Vision and Language

12-in-1: Multi-Task Vision and Language Representation Learning Please cite the following if you use this code. Code and pre-trained models for 12-in-

Meta Research 711 Jan 08, 2023
PyTorch implementation and pretrained models for XCiT models. See XCiT: Cross-Covariance Image Transformer

Cross-Covariance Image Transformer (XCiT) PyTorch implementation and pretrained models for XCiT models. See XCiT: Cross-Covariance Image Transformer L

Facebook Research 605 Jan 02, 2023
Easy-to-use CPM for Chinese text generation

CPM 项目描述 CPM(Chinese Pretrained Models)模型是北京智源人工智能研究院和清华大学发布的中文大规模预训练模型。官方发布了三种规模的模型,参数量分别为109M、334M、2.6B,用户需申请与通过审核,方可下载。 由于原项目需要考虑大模型的训练和使用,需要安装较为复杂

382 Jan 07, 2023
iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform

iSTFTNet : Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform This repo try to implement iSTFTNet : Fast

Rishikesh (ऋषिकेश) 126 Jan 02, 2023
A website which allows you to play with the GPT-2 transformer

transformers A website which allows you to play with the GPT-2 model Built with ❤️ by raphtlw Table of contents Model Setup About Contributors Model T

raphtlw 2 Jan 27, 2022
Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.

TextBlob: Simplified Text Processing Homepage: https://textblob.readthedocs.io/ TextBlob is a Python (2 and 3) library for processing textual data. It

Steven Loria 8.4k Dec 26, 2022