Data & Code for ACCENTOR Adding Chit-Chat to Enhance Task-Oriented Dialogues

Related tags

Deep Learningaccentor
Overview

ACCENTOR: Adding Chit-Chat to Enhance Task-Oriented Dialogues

Overview

ACCENTOR consists of the human-annotated chit-chat additions to the 23.8K dialogues from Schema Guided Dialogue (SGD) and MultiWOZ 2.1, allowing researchers to study contexutal addition of chit-chat utterances for virtual assistants, to make task-oriented dialogues more engaging and social.

We also provide three new models for ACCENTOR explicitly trained to predict user goals and to generate contextually relevant chit-chat responses.

Automatic and human evaluations show that, compared with the state of-the-art task-oriented baseline, our models can code-switch between task and chit-chat to be more engaging, interesting, knowledgeable, and humanlike, while maintaining competitive task performance.

For more details, please refer to this paper.

Data

  • v1.0/candidates-{sgd,multiwoz}.json: Annotated chit-chat candidates. The format is as follows.
{
 "dialogue 1 / id": [
  [
   dialogue 1 / candidate 1 / turn id,
   dialogue 1 / candidate 1 / position,
   dialogue 1 / candidate 1 / candidate,
   dialogue 1 / candidate 1 / label,
   dialogue 1 / candidate 1 / justification
  ],
  [
   dialogue 1 / candidate 2 / turn id,
   ...
  ],
  ...
 ],
 "dialogue 2 / id": [
  ...
 ],
 ...
}
  • Folder v1.0/accentor-sgd: The augmented SGD dataset. The format follows the original SGD dataset, with two additional keys (i.e., beginning and end) that store lists of (candidate, label, justification) tuples.

    • The folder is generated by v1.0/accentor-sgd.py (with v1.0/candidates-sgd.json and the original SGD dataset as input). Usage: python3 v1.0/accentor-sgd.py --help.
  • v1.0/accentor-multiwoz-1k.json: 1K augmented MultiWOZ 2.1 dialogues. The format follows the original MultiWOZ dataset, with two additional keys (i.e., beginning and end) that store lists of (candidate, label, justification) tuples.

    • The file is generated by v1.0/accentor-multiwoz.py (with v1.0/candidates-multiwoz.json and the original MultiWOZ 2.1 dataset as input). Usage: python3 v1.0/accentor-multiwoz.py --help.

Baseline Models

Preparation

  • Dependencies: ParlAI (af12799a) and Transformers (2.11.0)

  • Run the following commands to prepare the data for model training and the off-the-shelf models (i.e., a task-oriented dialogue model and a chit-chat model) for Arranger and Rewriter.

cp -r ./v1.0/accentor-sgd .

python3 gen_delex.py

python3 gen_parlai_data.py

parlai train_model -t fromfile:parlaiformat --fromfile_datapath ./parlai --fromfile-datatype-extension true  -m transformer/generator --init-model zoo:tutorial_transformer_generator/model --dict-file zoo:tutorial_transformer_generator/model.dict --embedding-size 512 --n-layers 8 --ffn-size 2048 --dropout 0.1 --n-heads 16 --learn-positional-embeddings True --n-positions 512 --variant xlm --activation gelu --skip-generation True --fp16 True --text-truncate 512 --label-truncate 128 --dict-tokenizer bpe --dict-lower True -lr 1e-06 --optimizer adamax --lr-scheduler reduceonplateau --gradient-clip 0.1 -veps 0.25 --betas 0.9,0.999 --update-freq 1 --attention-dropout 0.0 --relu-dropout 0.0 --skip-generation True -vp 15 -stim 60 -vme 20000 -bs 16 -vmt ppl -vmm min --save-after-valid True --model-file ./train_90M

parlai interactive -mf ./train_90M < lm.input.dev.cc.txt > lm.output.dev.cc.txt

parlai interactive -mf ./train_90M < lm.input.test.cc.txt > lm.output.test.cc.txt

python3 run_language_modeling.py --output_dir=output_gpt2_10epoch_1e-3_fp16 --model_type=gpt2 --model_name_or_path=gpt2 --do_train --train_data_file=lm.input.train.txt --do_eval  --eval_data_file=lm.input.dev.txt --per_device_train_batch_size 2 --gradient_accumulation_steps 18 --num_train_epochs 10 --learning_rate 1e-3 --fp16 --overwrite_output_dir

python3 run_generation.py --input lm.input.dev.eval.txt --output dev.inference.gpt2_10epoch_1e-3_fp16.json --model_name_or_path ./output_gpt2_10epoch_1e-3_fp16 --eos_token_id 50262

python3 run_generation.py --input lm.input.test.eval.txt --output test.inference.gpt2_10epoch_1e-3_fp16.json --model_name_or_path ./output_gpt2_10epoch_1e-3_fp16 --eos_token_id 50262

SimpleTOD+

  • Dependency: Transformers (2.11.0)
python3 run_language_modeling.py --output_dir=output_both_gpt2_10epoch_1e-3_fp16 --model_type=gpt2 --model_name_or_path=gpt2 --do_train --train_data_file=lm.input.train.both.txt --do_eval  --eval_data_file=lm.input.dev.both.txt --per_device_train_batch_size 2 --gradient_accumulation_steps 18 --num_train_epochs 10 --learning_rate 1e-3 --fp16 --overwrite_output_dir

python3 run_generation.py --input lm.input.dev.eval.txt --output dev.inference.both_gpt2_10epoch_1e-3_fp16.json --model_name_or_path ./output_both_gpt2_10epoch_1e-3_fp16 --eos_token_id 50262

python3 run_generation.py --input lm.input.test.eval.txt --output test.inference.both_gpt2_10epoch_1e-3_fp16.json --model_name_or_path ./output_both_gpt2_10epoch_1e-3_fp16 --eos_token_id 50262

Arranger

  • Dependency: Transformers (2.2.0)
python3 gen_arranger_input.py

python3 run_multiple_choice.py --model_type roberta --task_name acc --model_name_or_path roberta-base --do_train --do_eval --do_test --do_lower_case --data_dir . --learning_rate 2e-5 --num_train_epochs 3 --max_seq_length 512 --output_dir acc_arranger_roberta_base_3epoch --per_gpu_eval_batch_size=16 --per_gpu_train_batch_size=1 --gradient_accumulation_steps 24 --overwrite_output --save_steps 10000

python3 gen_arranger_output.py

Rewriter

  • Dependency: Transformers 2.11.0
python3 gen_rewriter_data.py

python3 run_language_modeling.py --output_dir=output_ff_gpt2_10epoch_1e-3_fp16 --model_type=gpt2 --model_name_or_path=gpt2 --do_train --train_data_file=lm.input.train.ff.txt  --do_eval --eval_data_file=lm.input.dev.ff.txt --per_device_train_batch_size 2 --gradient_accumulation_steps 18 --num_train_epochs 10 --learning_rate 1e-3 --fp16 --overwrite_output_dir

python3 run_generation.py --input lm.input.dev.eval.ff.txt --output dev.inference.ff_gpt2_10epoch_1e-3_fp16.json --model_name_or_path ./output_ff_gpt2_10epoch_1e-3_fp16 --eos_token_id 50262

python3 run_generation.py --input lm.input.test.eval.ff.txt --output test.inference.ff_gpt2_10epoch_1e-3_fp16.json --model_name_or_path ./output_ff_gpt2_10epoch_1e-3_fp16 --eos_token_id 50262

Evaluation

  • Dependency: the official evaluation script of SGD

  • Pass the output inference files (i.e., {dev,test}.inference*.json) to gen_predict.py to obtain act-slot F1 and BLEU-4 scores. For example,

python3 gen_predict.py --inference test.inference.both_gpt2_10epoch_1e-3_fp16.json --split test
  • The above command will also generate a folder (named ./prediction/ by default), which can be passed to the official evaluation script of SGD to obtain the joint goal accuracy and average accuracy. For example,
python3 -m schema_guided_dst.evaluate --dstc8_data_dir ./simpletod/ --prediction_dir ./prediction/test/ --eval_set test --output_metric_file simpletod+_test_result.json

Citations

If you want to publish experimental results with our datasets or use the baseline models, please cite the following article (pdf):

@inproceedings{sun2020adding,
  title={Adding Chit-Chat to Enhance Task-Oriented Dialogues},
  author={Sun, Kai and Moon, Seungwhan and Crook, Paul and Roller, Stephen and Silvert, Becka and Liu, Bing and Wang, Zhiguang and Liu, Honglei and Cho, Eunjoon and Cardie, Claire},
  booktitle={Proceedings of the NAACL-HLT},
  year={2021},
  url={https://arxiv.org/abs/2010.12757}
}

License

ACCENTOR is released under CC-BY-SA-4.0, see LICENSE for details.

Owner
Facebook Research
Facebook Research
This MVP data web app uses the Streamlit framework and Facebook's Prophet forecasting package to generate a dynamic forecast from your own data.

📈 Automated Time Series Forecasting Background: This MVP data web app uses the Streamlit framework and Facebook's Prophet forecasting package to gene

Zach Renwick 42 Jan 04, 2023
Github Traffic Insights as Prometheus metrics.

github-traffic Github Traffic collects your repository's traffic data and exposes it as Prometheus metrics. Grafana dashboard that displays the metric

Grafana Labs 34 Oct 27, 2022
Python script to download the celebA-HQ dataset from google drive

download-celebA-HQ Python script to download and create the celebA-HQ dataset. WARNING from the author. I believe this script is broken since a few mo

133 Dec 21, 2022
ChainerRL is a deep reinforcement learning library built on top of Chainer.

ChainerRL and PFRL ChainerRL (this repository) is a deep reinforcement learning library that implements various state-of-the-art deep reinforcement al

Chainer 1.1k Jan 01, 2023
As a part of the HAKE project, includes the reproduced SOTA models and the corresponding HAKE-enhanced versions (CVPR2020).

HAKE-Action HAKE-Action (TensorFlow) is a project to open the SOTA action understanding studies based on our Human Activity Knowledge Engine. It inclu

Yong-Lu Li 94 Nov 18, 2022
Mask2Former: Masked-attention Mask Transformer for Universal Image Segmentation in TensorFlow 2

Mask2Former: Masked-attention Mask Transformer for Universal Image Segmentation in TensorFlow 2 Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexan

Phan Nguyen 1 Dec 16, 2021
Train a deep learning net with OpenStreetMap features and satellite imagery.

DeepOSM Classify roads and features in satellite imagery, by training neural networks with OpenStreetMap (OSM) data. DeepOSM can: Download a chunk of

TrailBehind, Inc. 1.3k Nov 24, 2022
A CNN implementation using only numpy. Supports multidimensional images, stride, etc.

A CNN implementation using only numpy. Supports multidimensional images, stride, etc. Speed up due to heavy use of slicing and mathematical simplification..

2 Nov 30, 2021
TensorFlow code for the neural network presented in the paper: "Structural Language Models of Code" (ICML'2020)

SLM: Structural Language Models of Code This is an official implementation of the model described in: "Structural Language Models of Code" [PDF] To ap

73 Nov 06, 2022
Elastic weight consolidation technique for incremental learning.

Overcoming-Catastrophic-forgetting-in-Neural-Networks Elastic weight consolidation technique for incremental learning. About Use this API if you dont

Shivam Saboo 89 Dec 22, 2022
Part-Aware Data Augmentation for 3D Object Detection in Point Cloud

Part-Aware Data Augmentation for 3D Object Detection in Point Cloud This repository contains a reference implementation of our Part-Aware Data Augment

Jaeseok Choi 62 Jan 03, 2023
The code from the paper Character Transformations for Non-Autoregressive GEC Tagging

Character Transformations for Non-Autoregressive GEC Tagging Milan Straka, Jakub Náplava, Jana Straková Charles University Faculty of Mathematics and

ÚFAL 5 Dec 10, 2022
Code for ACM MM 2020 paper "NOH-NMS: Improving Pedestrian Detection by Nearby Objects Hallucination"

NOH-NMS: Improving Pedestrian Detection by Nearby Objects Hallucination The offical implementation for the "NOH-NMS: Improving Pedestrian Detection by

Tencent YouTu Research 64 Nov 11, 2022
🥇Samsung AI Challenge 2021 1등 솔루션입니다🥇

MoT - Molecular Transformer Large-scale Pretraining for Molecular Property Prediction Samsung AI Challenge for Scientific Discovery This repository is

Jungwoo Park 44 Dec 03, 2022
A small library of 3D related utilities used in my research.

utils3D A small library of 3D related utilities used in my research. Installation Install via GitHub pip install git+https://github.com/Steve-Tod/util

Zhenyu Jiang 8 May 20, 2022
A PyTorch Lightning Callback for pushing models to the Hugging Face Hub 🤗⚡️

hf-hub-lightning A callback for pushing lightning models to the Hugging Face Hub. Note: I made this package for myself, mostly...if folks seem to be i

Nathan Raw 27 Dec 14, 2022
Scripts used to make and evaluate OpenAlex's concept tagging model

openalex-concept-tagging This repository contains all of the code for getting the concept tagger up and running. To learn more about where this model

OurResearch 18 Dec 09, 2022
UniFormer - official implementation of UniFormer

UniFormer This repo is the official implementation of "Uniformer: Unified Transformer for Efficient Spatiotemporal Representation Learning". It curren

SenseTime X-Lab 573 Jan 04, 2023
Machine Learning automation and tracking

The Open-Source MLOps Orchestration Framework MLRun is an open-source MLOps framework that offers an integrative approach to managing your machine-lea

873 Jan 04, 2023
An implementation of the efficient attention module.

Efficient Attention An implementation of the efficient attention module. Description Efficient attention is an attention mechanism that substantially

Shen Zhuoran 194 Dec 15, 2022