Source code for AAAI20 "Generating Persona Consistent Dialogues by Exploiting Natural Language Inference".

Overview

Generating Persona Consistent Dialogues by Exploiting Natural Language Inference

Source code for the RCDG model from the AAAI20 paper "Generating Persona Consistent Dialogues by Exploiting Natural Language Inference", a natural language inference (NLI) enhanced reinforcement learning dialogue model.

Requirements:

The code has been tested in the following environment:

  • Python 3.6
  • Pytorch 0.3.1

Install with conda: conda install pytorch==0.3.1 torchvision cudatoolkit=7.5 -c pytorch

This released code has been tested on a Titan-XP 12G GPU.

Data

We have provided some data samples in ./data to show the format. For downloading the full datasets, please refer to the following papers:

How to Run:

To make the code easier to run, the NLI model here is GRU+MLP, i.e. RCDG_base, and the time-consuming MC search has been removed.

Here are a few steps to run this code:

0. Prepare Data

python preprocess.py -train_src data/src-train.txt -train_tgt data/tgt-train.txt -train_per data/per-train.txt -valid_src data/src-val.txt -valid_tgt data/tgt-val.txt -valid_per data/per-val.txt -train_nli data/nli-train.txt -valid_nli data/nli-valid.txt -save_data data/nli_persona -src_vocab_size 18300 -tgt_vocab_size 18300 -share_vocab

As introduced in the paper, training proceeds in several stages:

1. NLI model Pretrain

cd NLI_pretrain/

python train.py -data ../data/nli_persona -batch_size 32 -save_model saved_model/consistent_dialogue -rnn_size 500 -word_vec_size 300 -dropout 0.2 -epochs 5 -learning_rate_decay 1 -gpu 0

And you should see something like:

Loading train dataset from ../data/nli_persona.train.1.pt, number of examples: 131432
Epoch  1, nli_step     1/ 4108; nli: 0.28125
Epoch  1, nli_step    11/ 4108; nli: 0.38125
Epoch  1, nli_step    21/ 4108; nli: 0.43438
Epoch  1, nli_step    31/ 4108; nli: 0.48125
Epoch  1, nli_step    41/ 4108; nli: 0.53750
Epoch  1, nli_step    51/ 4108; nli: 0.56250
Epoch  1, nli_step    61/ 4108; nli: 0.49062
...
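
The step total in these logs (4108) is consistent with one step per mini-batch: the loader reports 131432 training examples, and the commands use -batch_size 32. The check below is a minimal sketch that assumes the final partial batch is kept; that assumption is not confirmed by the released code, and `steps_per_epoch` is a hypothetical helper, not part of this project.

```python
import math


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Mini-batches per epoch, assuming the last partial batch is kept."""
    return math.ceil(num_examples / batch_size)


# 131432 / 32 = 4107.25, which rounds up to 4108 steps.
print(steps_per_epoch(131432, 32))
```

If you change -batch_size, the denominator shown in the log should change accordingly.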

2. Generator G Pretrain

cd ../G_pretrain/

python train.py -data ../data/nli_persona -batch_size 32 -rnn_size 500 -word_vec_size 300  -dropout 0.2 -epochs 15 -g_optim adam -g_learning_rate 1e-3 -learning_rate_decay 1 -train_from PATH_TO_PRETRAINED_NLI -gpu 0

Here the PATH_TO_PRETRAINED_NLI should be replaced by your model path, e.g., ../NLI_pretrain/saved_model/consistent_dialogue_e3.pt.
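
If several checkpoints have been saved, a small helper can locate one to pass to -train_from. The sketch below assumes the naming scheme `<prefix>_e<epoch>.pt`, inferred only from the example path above; `latest_checkpoint` is a hypothetical helper and not part of this project.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed naming scheme, inferred from the example path: <prefix>_e<epoch>.pt
CKPT_RE = re.compile(r"_e(\d+)\.pt$")


def latest_checkpoint(model_dir: str, prefix: str) -> Optional[Path]:
    """Return the checkpoint with the highest epoch number, or None if none match."""
    best = None
    best_epoch = -1
    for path in Path(model_dir).glob(f"{prefix}_e*.pt"):
        match = CKPT_RE.search(path.name)
        if match and int(match.group(1)) > best_epoch:
            best_epoch = int(match.group(1))
            best = path
    return best


if __name__ == "__main__":
    print(latest_checkpoint("../NLI_pretrain/saved_model", "consistent_dialogue"))
```

The latest checkpoint is not necessarily the best one; choosing by validation performance may be preferable.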

You should see the ppl come down during training, which indicates that the dialogue model is learning:

Loading train dataset from ../data/nli_persona.train.1.pt, number of examples: 131432
Epoch  4, teacher_force     1/ 4108; acc:   0.00; ppl: 18619.76; 125 src tok/s; 162 tgt tok/s;      3 s elapsed
Epoch  4, teacher_force    11/ 4108; acc:   9.69; ppl: 2816.01; 4159 src tok/s; 5468 tgt tok/s;      3 s elapsed
Epoch  4, teacher_force    21/ 4108; acc:   9.78; ppl: 550.46; 5532 src tok/s; 6116 tgt tok/s;      4 s elapsed
Epoch  4, teacher_force    31/ 4108; acc:  11.15; ppl: 383.06; 5810 src tok/s; 6263 tgt tok/s;      5 s elapsed
...
Epoch  4, teacher_force   941/ 4108; acc:  25.40; ppl:  90.18; 5993 src tok/s; 6645 tgt tok/s;     63 s elapsed
Epoch  4, teacher_force   951/ 4108; acc:  27.49; ppl:  77.07; 5861 src tok/s; 6479 tgt tok/s;     64 s elapsed
Epoch  4, teacher_force   961/ 4108; acc:  26.24; ppl:  83.17; 5473 src tok/s; 6443 tgt tok/s;     64 s elapsed
Epoch  4, teacher_force   971/ 4108; acc:  24.33; ppl:  97.14; 5614 src tok/s; 6685 tgt tok/s;     65 s elapsed
...
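
To check the ppl trend without reading the log by eye, the teacher_force lines can be parsed with a regular expression. This is a minimal sketch based only on the line format shown above; `parse_teacher_force` is a hypothetical helper, and the sample text is copied from the log excerpt.

```python
import re
from typing import List, Tuple

# Matches lines such as:
# "Epoch  4, teacher_force    11/ 4108; acc:   9.69; ppl: 2816.01; ..."
LINE_RE = re.compile(
    r"Epoch\s+(\d+),\s+teacher_force\s+(\d+)/\s*(\d+);"
    r"\s+acc:\s+([\d.]+);\s+ppl:\s+([\d.]+);"
)


def parse_teacher_force(log_text: str) -> List[Tuple[int, int, float, float]]:
    """Extract (epoch, step, acc, ppl) tuples from teacher_force log lines."""
    rows = []
    for line in log_text.splitlines():
        m = LINE_RE.search(line)
        if m:
            rows.append(
                (int(m.group(1)), int(m.group(2)), float(m.group(4)), float(m.group(5)))
            )
    return rows


sample = """\
Epoch  4, teacher_force     1/ 4108; acc:   0.00; ppl: 18619.76; 125 src tok/s; 162 tgt tok/s;      3 s elapsed
Epoch  4, teacher_force    11/ 4108; acc:   9.69; ppl: 2816.01; 4159 src tok/s; 5468 tgt tok/s;      3 s elapsed
Epoch  4, teacher_force   941/ 4108; acc:  25.40; ppl:  90.18; 5993 src tok/s; 6645 tgt tok/s;     63 s elapsed
"""

rows = parse_teacher_force(sample)
# Compare the first and last parsed ppl values to see the overall trend.
print("ppl decreased:", rows[0][3] > rows[-1][3])
```

Per-step ppl is noisy, as the excerpt shows, so comparing values far apart (or averaging over a window) is more informative than comparing adjacent steps.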

3. Discriminator D Pretrain

cd ../D_pretrain/

python train.py -epochs 20 -d_optim adam -d_learning_rate 1e-4 -data ../data/nli_persona -train_from PATH_TO_PRETRAINED_G -batch_size 32 -learning_rate_decay 0.99 -gpu 0

Similarly, replace PATH_TO_PRETRAINED_G with the G Pretrain model path.

The accuracy of D is displayed during training:

Loading train dataset from ../data/nli_persona.train.1.pt, number of examples: 131432
Epoch  5, d_step     1/ 4108; d: 0.49587
Epoch  5, d_step    11/ 4108; d: 0.51580
Epoch  5, d_step    21/ 4108; d: 0.49853
Epoch  5, d_step    31/ 4108; d: 0.55248
Epoch  5, d_step    41/ 4108; d: 0.55168
...

4. Reinforcement Training

cd ../reinforcement_train/

python train.py -epochs 30 -batch_size 32 -d_learning_rate 1e-4 -g_learning_rate 1e-4 -learning_rate_decay 0.9 -data ../data/nli_persona -train_from PATH_TO_PRETRAINED_D -gpu 0

Remember to replace PATH_TO_PRETRAINED_D with the D Pretrain model path.

Note that -epochs is global across all stages, which matters if you want to tune this parameter. For example, if the D Pretrain model was trained for 20 epochs in total, this Reinforcement Training stage runs for 30 - 20 = 10 epochs.

Loading train dataset from ../data/nli_persona.train.1.pt, number of examples: 131432
Epoch  7, self_sample     1/ 4108; acc:   2.12; ppl:   0.28; 298 src tok/s; 234 tgt tok/s;      2 s elapsed
Epoch  7, teacher_force    11/ 4108; acc:   3.32; ppl:   0.53; 2519 src tok/s; 2772 tgt tok/s;      3 s elapsed
Epoch  7, d_step    21/ 4108; d: 0.98896
Epoch  7, d_step    31/ 4108; d: 0.99906
Epoch  7, self_sample    41/ 4108; acc:   0.00; ppl:   0.27; 1769 src tok/s; 260 tgt tok/s;      7 s elapsed
Epoch  7, teacher_force    51/ 4108; acc:   2.83; ppl:   0.43; 2368 src tok/s; 2910 tgt tok/s;      9 s elapsed
Epoch  7, d_step    61/ 4108; d: 0.75311
Epoch  7, d_step    71/ 4108; d: 0.83919
Epoch  7, self_sample    81/ 4108; acc:   6.20; ppl:   0.33; 1791 src tok/s; 232 tgt tok/s;     12 s elapsed
...
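
The note above on global -epochs can be worked through for the four training commands in this README (-epochs 5, 15, 20 and 30). The sketch below assumes each stage resumes from the final epoch of the previous stage; `epochs_per_stage` is a hypothetical helper, not part of this project.

```python
from typing import Dict, List, Tuple


def epochs_per_stage(stage_flags: List[Tuple[str, int]]) -> Dict[str, int]:
    """Convert cumulative (global) -epochs values into per-stage epoch counts.

    Assumes each stage resumes from the last epoch of the previous stage.
    """
    counts = {}
    previous_total = 0
    for name, total in stage_flags:
        if total < previous_total:
            raise ValueError(f"-epochs for {name} is below the previous stage total")
        counts[name] = total - previous_total
        previous_total = total
    return counts


flags = [
    ("NLI pretrain", 5),
    ("G pretrain", 15),
    ("D pretrain", 20),
    ("Reinforcement", 30),
]
print(epochs_per_stage(flags))
```

Under this assumption the stages run for 5, 10, 5 and 10 epochs respectively. If a stage resumes from an earlier checkpoint instead, it would run correspondingly more epochs.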

5. Testing Trained Model

Now that we have a trained dialogue model, we can test it. Still in ./reinforcement_train/, run:

python predict.py -model TRAINED_MODEL_PATH  -src ../data/src-val.txt -tgt ../data/tgt-val.txt -replace_unk -verbose -output ./results.txt -per ../data/per-val.txt -nli nli-val.txt -gpu 0

MISC

  • Initializing Model Seems Slow?

    This is a legacy problem in pytorch < 0.4 and is not caused by this project. Training efficiency is not affected.

  • BibTex

     @article{Song_RCDG_2020,
     	title={Generating Persona Consistent Dialogues by Exploiting Natural Language Inference},
     	volume={34},
     	DOI={10.1609/aaai.v34i05.6417},
     	number={05},
     	journal={Proceedings of the AAAI Conference on Artificial Intelligence},
     	author={Song, Haoyu and Zhang, Wei-Nan and Hu, Jingwen and Liu, Ting},
     	year={2020},
     	month={Apr.},
     	pages={8878-8885}
     	}
    