Python-zhuyin - An open source Python library that provides a unified interface for converting between Chinese pinyin and Zhuyin (bopomofo)

Last update: Dec 29, 2022


Overview

Python-Zhuyin (pyzhuyin): Zhuyin and Pinyin Conversion

Introduction

pyzhuyin is an open-source Python library that provides a unified interface for converting between Chinese pinyin and Zhuyin (bopomofo).

Installation

pip install pyzhuyin

Usage

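from pyzhuyin import pinyin_to_zhuyin, zhuyin_to_pinyin

# Pinyin (numeric tones) to zhuyin
assert pinyin_to_zhuyin("lu3") == "ㄌㄨˇ"
assert pinyin_to_zhuyin("dan4") == "ㄉㄢˋ"
# map() returns an iterator in Python 3, so wrap it in list() before comparing
assert list(map(pinyin_to_zhuyin, ["lu3", "dan4"])) == ["ㄌㄨˇ", "ㄉㄢˋ"]

# Zhuyin to pinyin; the neutral tone is written as 5
assert zhuyin_to_pinyin("ㄌㄩˊ") == "lü2"
assert zhuyin_to_pinyin("˙ㄗ") == "zi5"
# u_to_v=True writes ü as v in the output
assert [zhuyin_to_pinyin(z, u_to_v=True) for z in ["ㄌㄩˊ", "˙ㄗ"]] == ["lv2", "zi5"]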
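
When converting many syllables at once, a single unsupported syllable raises ValueError and stops the whole batch. The sketch below is one possible way to keep going and collect the failures instead. The helper name convert_syllables and the stand-in demo_convert are hypothetical and not part of pyzhuyin; with the library installed, pinyin_to_zhuyin or zhuyin_to_pinyin would be passed as the convert argument.

```python
from typing import Callable, List, Tuple


def convert_syllables(
    syllables: List[str],
    convert: Callable[[str], str],
) -> Tuple[List[str], List[str]]:
    """Convert each syllable; return (converted, failed).

    `convert` is any function that raises ValueError for input it cannot
    handle, which is how the README describes pinyin_to_zhuyin and
    zhuyin_to_pinyin.
    """
    converted: List[str] = []
    failed: List[str] = []
    for syllable in syllables:
        try:
            converted.append(convert(syllable))
        except ValueError:
            # Remember the rejected syllable and continue with the rest.
            failed.append(syllable)
    return converted, failed


# Stand-in converter for demonstration only (hypothetical, two entries).
_DEMO_TABLE = {"lu3": "ㄌㄨˇ", "dan4": "ㄉㄢˋ"}


def demo_convert(syllable: str) -> str:
    if syllable not in _DEMO_TABLE:
        raise ValueError(f"no zhuyin found for {syllable!r}")
    return _DEMO_TABLE[syllable]


converted, failed = convert_syllables("lu3 dan4 yuanr2".split(), demo_convert)
print(converted, failed)
```

Returning the failures separately lets the caller decide whether to skip them, report them, or fix the input and retry.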

Testing

Run the following command from the project root to run the test suite:

python3 -m unittest

Notes

Only numeric tones are supported for pinyin
- e.g. "lu3" instead of "lǔ"
The neutral tone is represented as 5
- e.g. "˙ㄗ" -> "zi5"
For pinyin_to_zhuyin:
- raises ValueError if no corresponding zhuyin is found
- internally converts every v to ü
For zhuyin_to_pinyin:
- raises ValueError if no corresponding pinyin is found
Erhua (兒化音, the r-suffix) is not supported because it cannot be represented in the zhuyin system as a single "combo" syllable
- e.g. "公園兒" -> "gong1 yuanr2" -> "ㄍㄨㄥㄩㄢㄦˊ" (not allowed)
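
Because only numeric tones are accepted, pinyin written with tone marks has to be normalized first. The following is a minimal sketch of one way to do that with the standard library's unicodedata module. The function name to_numeric_tone is hypothetical and not part of pyzhuyin, and the sketch assumes the input is a single syllable.

```python
import unicodedata

# Combining marks used for the four pinyin tones (after NFD decomposition).
TONE_MARKS = {
    "\u0304": "1",  # macron, first tone
    "\u0301": "2",  # acute, second tone
    "\u030c": "3",  # caron, third tone
    "\u0300": "4",  # grave, fourth tone
}


def to_numeric_tone(syllable: str) -> str:
    """Rewrite one tone-marked pinyin syllable with a trailing tone digit.

    A syllable with no tone mark is treated as neutral tone (5), matching
    the convention in the notes above. Input that already ends in a digit
    is returned unchanged.
    """
    if syllable and syllable[-1].isdigit():
        return syllable
    tone = "5"
    kept = []
    # NFD splits "ǔ" into "u" plus a combining caron, and so on.
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            kept.append(ch)
    # NFC recombines "u" plus diaeresis back into "ü".
    return unicodedata.normalize("NFC", "".join(kept)) + tone


for s in ["lǔ", "dàn", "lǘ", "zi"]:
    print(s, "->", to_numeric_tone(s))
```

Only the four tone marks are stripped, so the diaeresis on ü survives and the result can be passed on in the "lü2" form shown in the usage section.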

Data Sources

Ministry of Education, R.O.C. (中華民國教育部). Revised Mandarin Chinese Dictionary (《重編國語辭典修訂本》), version 2015_20210928.

URL: https://dict.revised.moe.edu.tw/

Licensed under CC BY-ND 3.0 TW

Author

Raymond Ku

