Generating Korean Slogans with phonetic and structural repetition

Overview

LexPOS_ko

Generating Korean Slogans with phonetic and structural repetition

Generating Slogans with Linguistic Features

LexPOS is a sequence-to-sequence transformer model that generates slogans with phonetic and structural repetition. For phonetic repetition, it searches for phonetically similar words with user keywords. Both the sound-alike words and user keywords become the lexical constraints while generating slogans. It also adjusts the logits distribution to implement further phonetic constraints. For structural repetition, LexPOS uses POS constraints. Users can specify any repeated phrase structure by POS tags.

Generating slogans with lexical, POS constraints

1. Code

  • Need to download pretrained Korean word2vec model from here and put it below phonetic_similarity/KoG2P
# clone this repo
git clone https://github.com/yeounyi/LexPOS_ko
cd LexPOS
# generate slogans 
python3 generate_slogans.py --keywords 카드,혜택 --num_beams 3 --temperature 1.2
  • -keywords: Keywords that you want to be included in slogans. You can enter multiple keywords, delimited by comma
  • -pos_inputs: You can either specify the particular list of POS tags delimited by comma, or the model will generate slogans with the most frequent syntax used in corpus. POS tags generally follow the format of Konlpy Mecab POS tags.
  • -num_beams: Number of beams for beam search. Default to 1, meaning no beam search.
  • -temperature: The value used to module the next token probabilities. Default to 1.0.
  • -model_path: Path to the pretrained model

2. Examples

Keyword: 카드, 혜택
POS: [NNG, JK, VV, EC, SF, NNG, JK, VV, EF]
Output: 카드를 택하면, 혜택이 바뀐다

Keyword: 안전, 항공
POS: [MM, NNG, SF, MM, NNG, SF]
Output: 새로운 공항, 안전한 항공

Keywords: 추석, 선물
POS: [NNG, JK, MM, NNG, SF, NNG, JK, MM, NNG]
Output: 추석을 앞둔 추억, 당신을 위한 선물

Model Architecture


Pretrained Model

https://drive.google.com/drive/folders/1opkhDApURnjibVYmmhj5bqLTWy4miNe4?usp=sharing

References

https://github.com/scarletcho/KoG2P

Citation

@misc{yi2021lexpos,
  author = {Yi, Yeoun},
  title = {Generating Korean Slogans with Linguistic Constraints using Sequence-to-Sequence Transformer},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/yeounyi/LexPOS_ko}}
}
Owner
Yeoun Yi
Studying Computational Linguistics | Interested in Advertising & Marketing
Yeoun Yi
Use Tensorflow2.7.0 Build OpenAI'GPT-2

TF2_GPT-2 Use Tensorflow2.7.0 Build OpenAI'GPT-2 使用最新tensorflow2.7.0构建openai官方的GPT-2 NLP模型 优点 使用无监督技术 拥有大量词汇量 可实现续写(堪比“xx梦续写”) 实现对话后续将应用于FloatTech的Bot

Watermelon 9 Sep 13, 2022
Traditional Chinese Text Recognition Dataset: Synthetic Dataset and Labeled Data

Traditional Chinese Text Recognition Dataset: Synthetic Dataset and Labeled Data Authors: Yi-Chang Chen, Yu-Chuan Chang, Yen-Cheng Chang and Yi-Ren Ye

Yi-Chang Chen 5 Dec 15, 2022
Unsupervised intent recognition

INTENT author: steeve LAQUITAINE description: deployment pattern: currently batch only Setup & run git clone https://github.com/slq0/intent.git bash

sl 1 Apr 08, 2022
A relatively simple python program to generate one of those reddit text to speech videos dominating youtube.

Reddit text to speech generator A basic reddit tts video generator Current functionality Generate videos for subs based on comments,(askreddit) so rea

Aadvik 17 Dec 19, 2022
Analyse japanese ebooks using MeCab to determine the difficulty level for japanese learners

japanese-ebook-analysis This aim of this project is to make analysing the contents of a japanese ebook easy and streamline the process for non-technic

Christoffer Aakre 14 Jul 23, 2022
Repository for Graph2Pix: A Graph-Based Image to Image Translation Framework

Graph2Pix: A Graph-Based Image to Image Translation Framework Installation Install the dependencies in env.yml $ conda env create -f env.yml $ conda a

18 Nov 17, 2022
A simple version of DeTR

DeTR-Lite A simple version of DeTR Before you enjoy this DeTR-Lite The purpose of this project is to allow you to learn the basic knowledge of DeTR. P

Jianhua Yang 11 Jun 13, 2022
Task-based datasets, preprocessing, and evaluation for sequence models.

SeqIO: Task-based datasets, preprocessing, and evaluation for sequence models. SeqIO is a library for processing sequential data to be fed into downst

Google 290 Dec 26, 2022
This is a MD5 password/passphrase brute force tool

CROWES-PASS-CRACK-TOOl This is a MD5 password/passphrase brute force tool How to install: Do 'git clone https://github.com/CROW31/CROWES-PASS-CRACK-TO

9 Mar 02, 2022
ETM - R package for Topic Modelling in Embedding Spaces

ETM - R package for Topic Modelling in Embedding Spaces This repository contains an R package called topicmodels.etm which is an implementation of ETM

bnosac 37 Nov 06, 2022
GooAQ 🥑 : Google Answers to Google Questions!

This repository contains the code/data accompanying our recent work on long-form question answering.

AI2 112 Nov 06, 2022
Download videos from YouTube/Twitch/Twitter right in the Windows Explorer, without installing any shady shareware apps

youtube-dl and ffmpeg Windows Explorer Integration Download videos from YouTube/Twitch/Twitter and more (any platform that is supported by youtube-dl)

Wolfgang 226 Dec 30, 2022
Predict an emoji that is associated with a text

Sentiment Analysis Sentiment analysis in computational linguistics is a general term for techniques that quantify sentiment or mood in a text. Can you

Tetsumichi(Telly) Umada 30 Sep 07, 2022
An ActivityWatch watcher to pose questions to the user and record her answers.

aw-watcher-ask An ActivityWatch watcher to pose questions to the user and record her answers. This watcher uses Zenity to present dialog boxes to the

Bernardo Chrispim Baron 33 Dec 03, 2022
CCF BDCI 2020 房产行业聊天问答匹配赛道 A榜47/2985

CCF BDCI 2020 房产行业聊天问答匹配 A榜47/2985 赛题描述详见:https://www.datafountain.cn/competitions/474 文件说明 data: 存放训练数据和测试数据以及预处理代码 model_bert.py: 网络模型结构定义 adv_train

shuo 40 Sep 28, 2022
Mastering Transformers, published by Packt

Mastering Transformers This is the code repository for Mastering Transformers, published by Packt. Build state-of-the-art models from scratch with adv

Packt 195 Jan 01, 2023
Use the power of GPT3 to execute any function inside your programs just by giving some doctests

gptrun Don't feel like coding today? Use the power of GPT3 to execute any function inside your programs just by giving some doctests. How is this diff

Roberto Abdelkader Martínez Pérez 11 Nov 11, 2022
jiant is an NLP toolkit

jiant is an NLP toolkit The multitask and transfer learning toolkit for natural language processing research Why should I use jiant? jiant supports mu

ML² AT CILVR 1.5k Jan 04, 2023
A design of MIDI language for music generation task, specifically for Natural Language Processing (NLP) models.

MIDI Language Introduction Reference Paper: Pop Music Transformer: Beat-based Modeling and Generation of Expressive Pop Piano Compositions: code This

Robert Bogan Kang 3 May 25, 2022
Google and Stanford University released a new pre-trained model called ELECTRA

Google and Stanford University released a new pre-trained model called ELECTRA, which has a much compact model size and relatively competitive performance compared to BERT and its variants. For furth

Yiming Cui 1.2k Dec 30, 2022