Generating Korean Slogans with phonetic and structural repetition

Last update: May 23, 2022

LexPOS_ko

Generating Slogans with Linguistic Features

LexPOS is a sequence-to-sequence transformer model that generates slogans with phonetic and structural repetition. For phonetic repetition, it searches for words that sound similar to the user's keywords. Both the sound-alike words and the user's keywords then serve as lexical constraints during generation. The model also adjusts the logit distribution to enforce further phonetic constraints. For structural repetition, LexPOS uses part-of-speech (POS) constraints: users can specify any repeated phrase structure as a sequence of POS tags.
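To make the phonetic search step more concrete, here is a minimal sketch of one way to rank sound-alike candidates for a keyword. It is not the repository's implementation: as a simple stand-in, it decomposes Hangul syllables into jamo with Unicode NFD normalization and compares words by edit distance over those jamo. The function names, the 0.7 threshold, and the tiny vocabulary are all hypothetical.

```python
import unicodedata


def to_jamo(word: str) -> str:
    """Decompose precomposed Hangul syllables into conjoining jamo (Unicode NFD)."""
    return unicodedata.normalize("NFD", word)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phonetic_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]: 1.0 means the jamo sequences are identical."""
    ja, jb = to_jamo(a), to_jamo(b)
    longest = max(len(ja), len(jb))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(ja, jb) / longest


def sound_alikes(keyword, vocabulary, threshold=0.7):
    """Return (word, score) pairs at or above the threshold, best first."""
    scored = [(w, phonetic_similarity(keyword, w)) for w in vocabulary if w != keyword]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: -pair[1])


vocabulary = ["추억", "선물", "추석"]  # hypothetical candidate list
print(sound_alikes("추석", vocabulary))
```

With this vocabulary, 추억 is the only candidate kept for 추석, since the two words differ by a single jamo. A grapheme-to-phoneme step such as KoG2P would capture pronunciation rules that plain jamo comparison misses.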

Generating slogans with lexical and POS constraints

1. Code

First, download the pretrained Korean word2vec model from here and place it under phonetic_similarity/KoG2P.

# clone this repo
git clone https://github.com/yeounyi/LexPOS_ko
cd LexPOS_ko
# generate slogans 
python3 generate_slogans.py --keywords 카드,혜택 --num_beams 3 --temperature 1.2

- keywords: Keywords to include in the slogans. Separate multiple keywords with commas.
- pos_inputs: A comma-delimited list of POS tags for the slogan to follow. If omitted, the model generates slogans with the most frequent syntax in the corpus. POS tags generally follow the KoNLPy Mecab POS tag format.
- num_beams: Number of beams for beam search. Defaults to 1, meaning no beam search.
- temperature: The value used to modulate the next-token probabilities. Defaults to 1.0.
- model_path: Path to the pretrained model.
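As a rough illustration of what the temperature option does, the sketch below applies the standard temperature-scaled softmax to a toy list of logits. It assumes the script follows the usual convention of dividing logits by the temperature before normalizing. The function name and the logit values are hypothetical and not taken from the repository.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.1]  # hypothetical next-token scores
default_probs = softmax_with_temperature(toy_logits, temperature=1.0)
flatter_probs = softmax_with_temperature(toy_logits, temperature=1.2)
print(default_probs)
print(flatter_probs)
```

A temperature above 1.0, such as the 1.2 used in the command above, flattens the distribution so that less likely tokens are picked more often. Values below 1.0 sharpen it.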

2. Examples

Keywords: 카드, 혜택 (card, benefit)
POS: [NNG, JK, VV, EC, SF, NNG, JK, VV, EF]
Output: 카드를 택하면, 혜택이 바뀐다 (If you choose the card, the benefits change)

Keywords: 안전, 항공 (safety, aviation)
POS: [MM, NNG, SF, MM, NNG, SF]
Output: 새로운 공항, 안전한 항공 (A new airport, safe aviation)

Keywords: 추석, 선물 (Chuseok, gift)
POS: [NNG, JK, MM, NNG, SF, NNG, JK, MM, NNG]
Output: 추석을 앞둔 추억, 당신을 위한 선물 (Memories ahead of Chuseok, a gift for you)
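The POS templates in these examples repeat the same tag pattern on both sides of an SF tag. The sketch below shows one way to parse a comma-delimited template and measure how closely its first two clauses mirror each other. It treats SF as the clause separator, as the templates above appear to do. The helper names are hypothetical and not part of the repository.

```python
def parse_pos_inputs(raw: str) -> list:
    """Split a comma-delimited POS string such as 'NNG,JK,VV' into a tag list."""
    return [tag.strip() for tag in raw.split(",") if tag.strip()]


def split_clauses(pos_tags, separator="SF"):
    """Group tags into clauses, cutting at every separator tag."""
    clauses, current = [], []
    for tag in pos_tags:
        if tag == separator:
            if current:
                clauses.append(current)
            current = []
        else:
            current.append(tag)
    if current:
        clauses.append(current)
    return clauses


def repetition_ratio(pos_tags, separator="SF"):
    """Fraction of positions where the first two clauses share the same tag."""
    clauses = split_clauses(pos_tags, separator)
    if len(clauses) < 2:
        return 0.0
    first, second = clauses[0], clauses[1]
    matches = sum(a == b for a, b in zip(first, second))
    return matches / max(len(first), len(second))


template = parse_pos_inputs("NNG,JK,MM,NNG,SF,NNG,JK,MM,NNG")
print(split_clauses(template))
print(repetition_ratio(template))
```

For the first example's template, the two clauses differ only in their final ending tag (EC versus EF), so the ratio is 0.75 rather than 1.0.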

Model Architecture

Pretrained Model

https://drive.google.com/drive/folders/1opkhDApURnjibVYmmhj5bqLTWy4miNe4?usp=sharing

References

https://github.com/scarletcho/KoG2P

Citation

@misc{yi2021lexpos,
  author = {Yi, Yeoun},
  title = {Generating Korean Slogans with Linguistic Constraints using Sequence-to-Sequence Transformer},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/yeounyi/LexPOS_ko}}
}

Owner

Yeoun Yi
