DCL - An easy to use diacritic library used for diacritic and accent manipulation.

Related tags

Audiopython-dcl
Overview

Diacritics Library

Code Style: Black Imports: isort PRs welcome

This library is used for adding, and removing diacritics from strings.

Getting started

Start by importing the module:

import dcl

DCL currently supports a multitude of diacritics:

  • acute
  • breve
  • caron
  • cedilla
  • grave
  • interpunct
  • macron
  • ogonek
  • ring
  • ring_and_acute
  • slash
  • stroke
  • stroke_and_acute
  • tilde
  • tittle
  • umlaut/diaresis
  • umlaut_and_macron

Each accent has their own attribute which is directly accessible from the dcl module.

dcl.acute('a')
>>> 'á'

These attributes return a Character object, which is essentially just a handy "wrapper" around our diacritic, which we can use to access various attributes to retrieve further information about the diacritic we're focusing on.

" char.character # the same as str(char) >>> 'ą' char.diacritic # some return >>> '˛' char.diacritic_name >>> 'ogonek' char.raw # returns the raw representation of our character >>> '\U00000105' char.raw_diacritic >>> '\U000002db' ">
char = dcl.ogonek('a')

repr(char)
>>> ""

char.character  # the same as str(char)
>>> 'ą'

char.diacritic  # some return 
>>> '˛'

char.diacritic_name
>>> 'ogonek'

char.raw  # returns the raw representation of our character
>>> '\U00000105'

char.raw_diacritic 
>>> '\U000002db'

Some functions can't take certain letters. For example, the letter h cannot take a cedilla diacritic. In this case, an exception is raised named DiacriticError. You can access this exception via dcl.errors.DiacriticError.

from dcl.errors import DiacriticError

try:
    char = dcl.cedilla('h')
except DiacriticError as e:
    print(e)
else:
    print(repr(char))

>>> 'Character h cannot take a cedilla diacritic'

If you want to, you may also use the DiacriticApplicant object from dcl.objects. The functions you see above use this object too, and it's virtually the same principle, except from the fact that we use properties to get the diacritic, and the class simply holds the string and it's properties. Alas with the functions above, this object also returns the same Character object through it's properties.

" ">
from dcl.objects import DiacriticApplicant

da = DiacriticApplicant('a')
repr(da.ogonek)
>>> ""

There is also the clean_diacritics function, accessible straight from the dcl module. This function allows us to completely clean a string from any diacritics.

>> 'Kreusada' dcl.clean_diacritics("Café") >>> 'Cafe' ">
dcl.clean_diacritics("Krëûšàdå")
>>> 'Kreusada'

dcl.clean_diacritics("Café")
>>> 'Cafe'

Along with this function, there's also count_diacritics, get_diacritics and has_diacritics.

The has_diacritics function simply checks if the string contains a character with a diacritic.

>> True dcl.has_diacritics("dcl") >>> False ">
dcl.has_diacritics("Café")
>>> True

dcl.has_diacritics("dcl")
>>> False

The get_diacritics function is used to get all the diacritics in a string. It returns a dictionary. For each diacritic in the string, the key will show the diacritic's index in the string, and the value will show the Character representation.

>> {3: } dcl.get_diacritics("Krëûšàdå") >>> {2: , 3: , 4: , 5: , 7: } ">
dcl.get_diacritics("Café")
>>> {3: <acute 'é'>}

dcl.get_diacritics("Krëûšàdå")
>>> {2: <umlaut 'ë'>, 3: <circumflex 'û'>, 4: <caron 'š'>, 5: <grave 'à'>, 7: <ring 'å'>}

The count_diacritics function counts the number of diacritics in a string. The actual implementation of this simply returns the dictionary length from get_diacritics.

>> 1 ">
dcl.count_diacritics("Café")
>>> 1

Creating an end user program

Creating a program would be pretty simple for this, and I'd love to be able to help you out with a base idea. Have a look at this for example:

import dcl
import string

from dcl.errors import DiacriticError

char = str(input("Enter a character: "))
if not char in string.ascii_letters:
    print("Please enter a letter from a-Z.")
else:
    accent = str(input("Enter an accent, you can choose from the following: " + ", ".join(dcl.diacritic_list)))
    if not dcl.isdiacritictype(accent):
        print("That was not a valid accent.")
    else:
        try:
            function = getattr(dcl, accent)  # or dcl.objects.DiacriticApplicant
            output = function(char)
        except DiacriticError as e:
            print(e)
        else:
            print(str(output))

It's worth checking if the provided accent is a diacritic type. If it is, then you can use getattr. Without checking, the user could provide a default global such as __file__.

You can also create a program which can remove diacritics from a string. It's made easy!

import dcl

string = str(input("Enter the string which you want to be cleared from diacritics: "))
print("Here is your cleaned string: " + dcl.clean_diacritics(string))

Or perhaps your program wants to count the number of diacritics contained within your string.

import dcl

string = str(input("This program will count the number of diacritics contained in your input. Enter a string: "))
count = dcl.count_diacritics(string)
if count == 1:
    grammar = "is"
else:
    grammar = "are"
print(f"There {grammar} {count} diacritics/accent in your string.")
Owner
Kreus Amredes
Python developer, contributor and maintainer. 🐍
Kreus Amredes
A simple python script to play bell sound in your system infinitely, just for fun and experimental purposes

A simple python script to play bell sound in your system infinitely, just for fun and experimental purposes

نافع الهلالي 1 Oct 29, 2021
Simple, hackable offline speech to text - using the VOSK-API.

Nerd Dictation Offline Speech to Text for Desktop Linux. This is a utility that provides simple access speech to text for using in Linux without being

Campbell Barton 844 Jan 07, 2023
spafe: Simplified Python Audio-Features Extraction

spafe aims to simplify features extractions from mono audio files. The library can extract of the following features: BFCC, LFCC, LPC, LPCC, MFCC, IMFCC, MSRCC, NGCC, PNCC, PSRCC, PLP, RPLP, Frequenc

Ayoub Malek 310 Jan 01, 2023
Some utils for auto speech recognition

About Some utils for auto speech recognition. Utils Util Description Script Reset audio Reset sample rate, sample width, etc of audios.

1 Jan 24, 2022
Conferencing Speech Challenge

ConferencingSpeech 2021 challenge This repository contains the datasets list and scripts required for the ConferencingSpeech challenge. For more detai

73 Nov 29, 2022
Spotify Song Recommendation Program

Spotify-Song-Recommendation-Program Made by Esra Nur Özüm Written in Python The aim of this project was to build a recommendation system that recommen

esra nur özüm 1 Jun 30, 2022
Real-time audio visualizations (spectrum, spectrogram, etc.)

Friture Friture is an application to visualize and analyze live audio data in real-time. Friture displays audio data in several widgets, such as a sco

Timothée Lecomte 700 Dec 31, 2022
Open-Source Tools & Data for Music Source Separation: A Pragmatic Guide for the MIR Practitioner

Open-Source Tools & Data for Music Source Separation: A Pragmatic Guide for the MIR Practitioner

IELab@ Korea University 0 Nov 12, 2021
A Python wrapper around the Soundcloud API

soundcloud-python A friendly wrapper around the Soundcloud API. Installation To install soundcloud-python, simply: pip install soundcloud Or if you'r

SoundCloud 84 Dec 31, 2022
A Youtube audio player for your terminal

AudioLine A lightweight Youtube audio player for your terminal Explore the docs » View Demo · Report Bug · Request Feature · Send a Pull Request About

Haseeb Khalid 26 Jan 04, 2023
Gateware for the Terasic/Arrow DECA board, to become a USB2 high speed audio interface

DECA USB Audio Interface DECA based USB 2.0 High Speed audio interface Status / current limitations enumerates as class compliant audio device on Linu

Hans Baier 16 Mar 21, 2022
Code for "Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose"

Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose We provide PyTorch implementations for our arxiv paper "Audio-dr

Ran Yi 497 Jan 09, 2023
This is a python package that turns any images into MIDI files that views the same as them

image_to_midi This is a python package that turns any images into MIDI files that views the same as them. This package firstly convert the image to AS

Rainbow Dreamer 4 Mar 10, 2022
This library provides common speech features for ASR including MFCCs and filterbank energies.

python_speech_features This library provides common speech features for ASR including MFCCs and filterbank energies. If you are not sure what MFCCs ar

James Lyons 2.2k Jan 04, 2023
SU Music Player — The first open-source PyTgCalls based Pyrogram bot to play music in voice chats

SU Music Player — The first open-source PyTgCalls based Pyrogram bot to play music in voice chats Note Neither this, or PyTgCalls are fully

SU Projects 58 Jan 02, 2023
Implicit neural differentiable FM synthesizer

Implicit neural differentiable FM synthesizer The purpose of this project is to emulate arbitrary sounds with FM synthesis, where the parameters of th

Andreas Jansson 34 Nov 06, 2022
AudioDVP:Photorealistic Audio-driven Video Portraits

AudioDVP This is the official implementation of Photorealistic Audio-driven Video Portraits. Major Requirements Ubuntu = 18.04 PyTorch = 1.2 GCC =

232 Jan 03, 2023
This is an AI that runs in the terminal. It is a voice assistant that can do common activities and can also help in your coding doubts like

This is an AI that runs in the terminal. It is a voice assistant that can do common activities and can also help in your coding doubts like

OneBit 1 Nov 05, 2021
Cobra is a highly-accurate and lightweight voice activity detection (VAD) engine.

On-device voice activity detection (VAD) powered by deep learning.

Picovoice 88 Dec 16, 2022
Mentos Music Bot With Python

Mentos Music Bot For Any Query Join Our Support Group 👥 Special Thanks - @OfficialYukki Hey Welcome To Here 💫 💫 You Can Make Your Own Music Bot Fo

Cyber Toxic 13 Oct 21, 2022