Code for ACL 2020 paper "Rigid Formats Controlled Text Generation"

Last update: Dec 17, 2022

Overview

SongNet

SongNet: SongCi + Song (Lyrics) + Sonnet + etc.

@inproceedings{li-etal-2020-rigid,
    title = "Rigid Formats Controlled Text Generation",
    author = "Li, Piji and Zhang, Haisong and Liu, Xiaojiang and Shi, Shuming",
    booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.acl-main.68",
    doi = "10.18653/v1/2020.acl-main.68",
    pages = "742--751"
}

Run

python prepare_data.py
./train.sh

Evaluation

In test.py, set m_path to the path of the model checkpoint that scored best on the dev set, then run:
./test.sh
python metrics.py
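
The contents of metrics.py are not shown here. Purely as an illustration of what a format-conformance check for rigid-format text could look like, here is a minimal sketch. It assumes a format is described only by the number of characters in each clause, and that clauses are separated by common Chinese or English punctuation. The function names `clause_lengths` and `matches_format` are hypothetical and are not taken from this repository.

```python
import re

# Assumption: clauses are delimited by these punctuation marks or by newlines.
_DELIMITERS = r"[，。！？；、,.!?;\n]"


def clause_lengths(text):
    """Return the character count of each non-empty clause in `text`."""
    clauses = [c.strip() for c in re.split(_DELIMITERS, text)]
    return [len(c) for c in clauses if c]


def matches_format(text, expected_lengths):
    """True if the clause lengths of `text` equal `expected_lengths` exactly."""
    return clause_lengths(text) == list(expected_lengths)


if __name__ == "__main__":
    # A five-character couplet matches the template [5, 5].
    print(matches_format("床前明月光，疑是地上霜。", [5, 5]))  # True
    # A clause of the wrong length breaks the format.
    print(matches_format("床前明月，疑是地上霜。", [5, 5]))    # False
```

A check like this covers only clause lengths. Other constraints, such as rhyme, would need separate logic.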

Polish

./polish.sh

Download

The pretrained Chinese Language Model: https://drive.google.com/file/d/1g2tGyUwPe86vPn2nub1vkQva5lwtZ6Rd/view
The finetuned SongCi model: https://drive.google.com/file/d/16A2AzuU7slf7xj2QdLcBAorUCCaCk650/view

Reference

Guyu: https://github.com/lipiji/Guyu
Pretraining: https://github.com/lipiji/big_tpl_zh_10_base

Owner

Piji Li