easySpeech is an open-source Python wrapper for google speech to text API that doesn't require PyAudio(So you especially windows user don't have to deal with the errors while installing PyAudio) and also works with hugging face transformers

Last update: May 24, 2022

Overview

easySpeech

easySpeech is an open source python wrapper for google speech to text api that doesn't require PyAaudio(So you specially windows user don't have to deal with the errors while installing PyAudio) and also works with hugging face transformers

Installation

You can install easySpeech very easily using the following command

pip3 install easySpeech

Usage

Using google speech to text api
By default easySpeech comes with a default api key which you can for testing purposes using the following code.

from easySpeech import speech
a=speech.speech('google')
print(a)

For production purpose use your own key because google can revoke the default api key at any time. Get your own api key from http://www.chromium.org/developers/how-tos/api-keys and use the following code

from easySpeech import speech
a=speech.speech('google',key="your api key")
print(a)

Specifying the duration of speech recognition in seconds(default value is 5 seconds)

from easySpeech import speech
a=speech.speech('google',duration = 10)
print(a)

Specifying the sample frequency(default is 44100)

from easySpeech import speech
a=speech.speech('google',duration = 10,freq = 44100)
print(a)

Specifying the language(works only for google speech api and default is english)

from easySpeech import speech
a=speech.speech('google',language="en-US")
print(a)

Converting an audio file to text(Currently it supports only wav file)

from easySpeech import speech
a=speech.google_audio('recording.wav')
print(a)

Using hugging face transformers(works offline and no need of any kind of api key) For using easySpeech with hugging face transformers use the following code.

from easySpeech import speech
a=speech.speech('ml')
print(a)

Specifying the duration of speech recognition in seconds(default valus is 5 seconds)

from easySpeech import speech
a=speech.speech('ml',duration = 10)
print(a)

Specifying the sample frequency(default is 44100)

from easySpeech import speech
a=speech.speech('ml',duration = 10,freq = 44100)
print(a)

Converting an audio file to text(Currently it supports only wav file)

from easySpeech import ml
a=ml.ml('recording.wav')
print(a)

Recording audio
For recording audio use the following code

from easySpeech import speech
speech.recorder('recording.wav')

For recording audio with a specific frequency use the following code(default is 44100)

from easySpeech import speech
speech.recorder('recording.wav',freq = 50000)

For recording audio for a specific duration use the following code(default is 5s)

from easySpeech import speech
speech.recorder('recording.wav',duration = 50)

How to contribute

Since it is a free software , you can contribute to make it better. New contributors are always welcome, whether you write code, create resources, report bugs, or suggest features.

The easySpeech is written primarily in Python3x

Have a look at the open issues to find a mission that resonates with you.

Contact

Email: [email protected]
If you find any bug make a issue immediately.

License

easySpeech is lisenced under MIT license

MIT License | Copyright (c) 2021 SaptakBhoumik

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software

Releases(v1.0.2)

v1.0.2(Jun 3, 2021)
easySpeech is an open-source Python wrapper for google speech to text API that doesn't require PyAudio(So you especially windows user don't have to deal with the errors while installing PyAudio) and also works with hugging face transformers. You can also use it to record sound. What's new

It is now even more easy to use

Minor bug fix

Source code(tar.gz)
Source code(zip)
v1.0.1(Jun 1, 2021)

easySpeech is an open-source Python wrapper for google speech to text API that doesn't require PyAudio(So you especially windows user don't have to deal with the errors while installing PyAudio) and also works with hugging face transformers. You can also use it to record sound.
Source code(tar.gz)
Source code(zip)

Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech tagging and word segmentation.

Pytorch-NLU，一个中文文本分类、序列标注工具包，支持中文长文本、短文本的多类、多标签分类任务，支持中文命名实体识别、词性标注、分词等序列标注任务。 Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech tagging and word segmentation.

186 Dec 24, 2022

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.

The PyTorch-Kaldi Speech Recognition Toolkit PyTorch-Kaldi is an open-source repository for developing state-of-the-art DNN/HMM speech recognition sys

2.3k Dec 27, 2022

A Python module made to simplify the usage of Text To Speech and Speech Recognition.

Nav Module The solution for voice related stuff in Python Nav is a Python module which simplifies voice related stuff in Python. Just import the Modul

1 Dec 20, 2021

A python script to prefab your scripts/text files, and re create them with ease and not have to open your browser to copy code or write code yourself

Scriptfab - What is it? A python script to prefab your scripts/text files, and re create them with ease and not have to open your browser to copy code

3 Jul 28, 2021

A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk.

Simple-Vosk A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk. Check out the official Vosk G

2 Jun 19, 2022

PocketSphinx is a lightweight speech recognition engine, specifically tuned for handheld and mobile devices, though it works equally well on the desktop

PocketSphinx 5prealpha This is PocketSphinx, one of Carnegie Mellon University's open source large vocabulary, speaker-independent continuous speech r

3.2k Dec 28, 2022

easySpeech is an open-source Python wrapper for google speech to text API that doesn't require PyAudio(So you especially windows user don't have to deal with the errors while installing PyAudio) and also works with hugging face transformers

Related tags

Overview

easySpeech

Installation

Usage

How to contribute

Contact

License

You might also like...

Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech tagging and word segmentation.

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.

A Python module made to simplify the usage of Text To Speech and Speech Recognition.

A python script to prefab your scripts/text files, and re create them with ease and not have to open your browser to copy code or write code yourself

A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk.

PocketSphinx is a lightweight speech recognition engine, specifically tuned for handheld and mobile devices, though it works equally well on the desktop

Code for ACL 2022 main conference paper "STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation".

Creating an Audiobook (mp3 file) using a Ebook (epub) using BeautifulSoup and Google Text to Speech

Command Line Text-To-Speech using Google TTS

Releases(v1.0.2)

v1.0.2(Jun 3, 2021)

v1.0.1(Jun 1, 2021)

Owner

Saptak Bhoumik

A multi-voice TTS system trained with an emphasis on quality

Snips Python library to extract meaning from text

多语言降噪预训练模型MBart的中文生成任务

To create a deep learning model which can explain the content of an image in the form of speech through caption generation with attention mechanism on Flickr8K dataset.

Arabic speech recognition, classification and text-to-speech.

Use fastai-v2 with HuggingFace's pretrained transformers

ProteinBERT is a universal protein language model pretrained on ~106M proteins from the UniRef90 dataset.

Trained T5 and T5-large model for creating keywords from text

Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.

In this Notebook I've build some machine-learning and deep-learning to classify corona virus tweets, in both multi class classification and binary classification.

PG-19 Language Modelling Benchmark

vits chinese, tts chinese, tts mandarin

Named-entity recognition using neural networks. Easy-to-use and state-of-the-art results.

Code for the Findings of NAACL 2022(Long Paper): AdapterBias: Parameter-efficient Token-dependent Representation Shift for Adapters in NLP Tasks

nlpcommon is a python Open Source Toolkit for text classification.

this repository has datasets containing information of Uber pickups in NYC from April 2014 to September 2014 and January to June 2015. data Analysis , virtualization and some insights are gathered here

TPlinker for NER 中文/英文命名实体识别

Simple text to phones converter for multiple languages

Trex is a tool to match semantically similar functions based on transfer learning.

Weakly-supervised Text Classification Based on Keyword Graph