The speech synthesis engine of VOICEVOX, a free, medium-quality text-to-speech software

Last update: Jul 05, 2022


Overview

VOICEVOX ENGINE

The speech synthesis engine of VOICEVOX. It is actually an HTTP server, so you can synthesize speech from text by sending it requests.

API documentation

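With the VOICEVOX software running, open http://localhost:50021/docs in a browser to view the documentation.
The guide on integrating with the VOICEVOX speech synthesis engine ("VOICEVOX 音声合成エンジンとの連携") may also be a useful reference.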
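
For a quick overview of the available endpoints without a browser, the sketch below lists method and path pairs from an OpenAPI schema. It assumes the engine also serves a machine-readable schema at `/openapi.json`, which is common for servers that provide a `/docs` page but is not stated in this README. The helper names `list_endpoints` and `fetch_schema` are hypothetical.

```python
import json
import urllib.request

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_endpoints(schema: dict) -> list:
    """Return sorted (METHOD, path) pairs found in an OpenAPI schema dict."""
    pairs = []
    for path, item in schema.get("paths", {}).items():
        for key in item:
            # Path items may also hold non-method keys such as "parameters".
            if key in HTTP_METHODS:
                pairs.append((key.upper(), path))
    return sorted(pairs)


def fetch_schema(base_url: str = "http://localhost:50021") -> dict:
    """Download the schema. The /openapi.json path is an assumption."""
    with urllib.request.urlopen(f"{base_url}/openapi.json") as res:
        return json.load(res)
```

With the engine running, `list_endpoints(fetch_schema())` would return the pairs to print or inspect.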

Sample code for synthesizing speech via HTTP requests


text="ABCDEFG"

# Step 1: create a synthesis query from the text
curl -s \
    -X POST \
    "localhost:50021/audio_query?text=$text&speaker=1" \
    > query.json

# Step 2: synthesize audio from the query
curl -s \
    -H "Content-Type: application/json" \
    -X POST \
    -d @query.json \
    "localhost:50021/synthesis?speaker=1" \
    > audio.wav
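
The same two-step flow can be sketched in Python using only the standard library. The endpoints and the `text` and `speaker` parameters are taken from the curl sample above. The helper names `build_url` and `synthesize` are hypothetical, and the code assumes the engine is listening on the default address. Unlike the curl sample, it percent-encodes the text, which matters for non-ASCII input such as Japanese.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:50021"  # address used in the curl sample


def build_url(path: str, **params) -> str:
    """Join the base URL, an endpoint path, and percent-encoded query parameters."""
    return f"{BASE_URL}{path}?{urllib.parse.urlencode(params)}"


def synthesize(text: str, speaker: int = 1) -> bytes:
    """Request an audio query, then post it to /synthesis. Returns the audio bytes."""
    # Step 1: POST with an empty body; text and speaker travel in the query string.
    req = urllib.request.Request(
        build_url("/audio_query", text=text, speaker=speaker),
        data=b"",
        method="POST",
    )
    with urllib.request.urlopen(req) as res:
        query = res.read()

    # Step 2: POST the query JSON back unchanged to obtain the audio.
    req = urllib.request.Request(
        build_url("/synthesis", speaker=speaker),
        data=query,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as res:
        return res.read()
```

With the engine running, writing the result of `synthesize("ABCDEFG")` to a file opened in binary mode produces the same `audio.wav` as the curl commands.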

For contributors

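When creating a pull request that resolves an issue, either say on the issue that you have started working on it or open a Draft pull request first, so that you do not end up working on the same issue as someone else.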

Environment setup

# Install the libraries needed for development
pip install -r requirements-test.txt

# If you only want to run the engine, use this instead
pip install -r requirements.txt

Running

# Start the server using the release build of VOICEVOX
VOICEVOX_DIR="C:/path/to/voicevox" # path to the release-build VOICEVOX directory
python run.py --voicevox_dir=$VOICEVOX_DIR

# Start the server with a mock
python run.py
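
When a script launches the server and then sends requests, it helps to wait until the port accepts connections. The sketch below polls the port from the URL shown earlier (50021). The helper names `is_port_open` and `wait_for_engine` are hypothetical, and an open port only shows that something is listening, not that the engine has finished loading.

```python
import socket
import time


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_engine(
    host: str = "localhost",
    port: int = 50021,
    total_timeout: float = 30.0,
    interval: float = 0.5,
) -> bool:
    """Poll until the port opens or total_timeout seconds pass."""
    deadline = time.monotonic() + total_timeout
    while time.monotonic() < deadline:
        if is_port_open(host, port):
            return True
        time.sleep(interval)
    return False
```

A caller would check the return value of `wait_for_engine()` before sending the first request and stop with an error if it is `False`.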

Code formatting

Formats the code. Run this before sending a pull request.

pysen run format lint

Build

Build Tools for Visual Studio 2019 is required.

pip install -r requirements-dev.txt

python -m nuitka \
    --standalone \
    --plugin-enable=numpy \
    --follow-import-to=numpy \
    --follow-import-to=aiofiles \
    --include-package=uvicorn \
    --include-package-data=pyopenjtalk \
    --include-data-file=VERSION.txt=./ \
    --include-data-file=speakers.json=./ \
    --include-data-file=C:/path/to/voice_library/Release/*.dll=./ \
    --include-data-file=C:/path/to/voice_library/*.bin=./ \
    --include-data-dir=.venv/Lib/site-packages/_soundfile_data=./_soundfile_data \
    --msvc=14.2 \
    --follow-imports \
    --no-prefer-source-code \
    run.py

License

Dual-licensed under LGPL v3 and a separate license that does not require publishing source code. If you want to obtain the separate license, contact Hiho (twitter: @hiho_karuta).

Releases(check-code-sign-8)

check-code-sign-8(Jul 10, 2022)

Owner

Hiroshiba
