PyTorch implementation of the paper: Text is no more Enough! A Benchmark for Profile-based Spoken Language Understanding

Last update: Dec 14, 2022

Overview

This repository contains the official PyTorch implementation of the paper:

Text is no more Enough! A Benchmark for Profile-based Spoken Language Understanding. Xiao Xu*, Libo Qin*, Kaiji Chen, Guoxing Wu, Linlin Li, Wanxiang Che. AAAI 2022. [Paper (arXiv)] [Paper]

If you use any source code or datasets included in this toolkit in your work, please cite the following paper. The BibTeX entry is listed below:

...

The following sections walk you through using this repository step by step.

Workflow

Architecture

Results

Preparation

Our code is based on the following packages:

numpy==1.19.5
tqdm==4.50.2
pytorch==1.7.0
python==3.7.3
cudatoolkit==11.0.3
transformers==4.1.1

We highly recommend using Anaconda to manage your Python environment.
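Once the environment is set up, the pinned versions listed above can be checked with a short script. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `find_mismatches` are hypothetical, `torch` is assumed to be the import name of the `pytorch` package, and `cudatoolkit` is not checked because it is not a Python module.

```python
import importlib
import sys

# Pins copied from the package list; "torch" is the import name of pytorch.
PINNED = {
    "numpy": "1.19.5",
    "tqdm": "4.50.2",
    "torch": "1.7.0",
    "transformers": "4.1.1",
}
PINNED_PYTHON = "3.7.3"


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def find_mismatches(pinned, installed):
    """Return {name: (expected, found)} for every package whose version differs."""
    mismatches = {}
    for name, expected in pinned.items():
        found = installed.get(name)
        # Builds such as "1.7.0+cu110" carry a local suffix; compare the base only.
        base = found.split("+")[0] if found else None
        if base != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    installed = {name: installed_version(name) for name in PINNED}
    python_version = ".".join(str(part) for part in sys.version_info[:3])
    if python_version != PINNED_PYTHON:
        print(f"python: expected {PINNED_PYTHON}, found {python_version}")
    for name, (expected, found) in find_mismatches(PINNED, installed).items():
        print(f"{name}: expected {expected}, found {found}")
```

The script prints one line per mismatch and nothing when every pin is satisfied. A differing version does not necessarily mean the code will fail; the pins only record the versions the code is based on.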

We download the Chinese pretrained model checkpoints from the following links:

How to Run it

The script train.py serves as the main entry point of the project. You can run the experiments with the following commands.

# LSTM w/o Profile on TITAN Xp
python train.py -g -fs -es -uf -bs 8 -lr 0.0006
# LSTM w/ Profile on TITAN Xp
python train.py -g -fs -es -uf -ui -bs 8 -lr 0.0004
# Pretrained model (-mt XLNet) w/o Profile on Tesla V100s PCIE 32GB
python train.py -g -fs -es -uf -up -mt XLNet -bs 8 -lr 0.001 -blr 4e-05
# Pretrained model (-mt ELECTRA) w/ Profile on Tesla V100 PCIE 32GB
python train.py -g -fs -es -uf -up -ui -mt ELECTRA -bs 8 -lr 0.0008 -blr 4e-05
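To run several of these configurations in sequence, the commands above can be wrapped in a small launcher. This is only a sketch under stated assumptions: the experiment labels (`lstm_no_profile` and so on) and the functions `build_command` and `run_all` are hypothetical names introduced here, while the flag strings are copied verbatim from the commands above. It assumes the script is run from the repository root, next to train.py.

```python
import shlex
import subprocess
import sys

# Flag strings copied verbatim from the four commands; the keys are
# hypothetical labels, not names used by the repository.
EXPERIMENTS = {
    "lstm_no_profile": "-g -fs -es -uf -bs 8 -lr 0.0006",
    "lstm_profile": "-g -fs -es -uf -ui -bs 8 -lr 0.0004",
    "plm_no_profile": "-g -fs -es -uf -up -mt XLNet -bs 8 -lr 0.001 -blr 4e-05",
    "plm_profile": "-g -fs -es -uf -up -ui -mt ELECTRA -bs 8 -lr 0.0008 -blr 4e-05",
}


def build_command(name, script="train.py"):
    """Return the argv list for one named experiment."""
    return [sys.executable, script] + shlex.split(EXPERIMENTS[name])


def run_all(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for name in EXPERIMENTS:
        command = build_command(name)
        print(name, "->", " ".join(command))
        if not dry_run:
            # check=True stops the loop as soon as one training run fails.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

By default the launcher only prints the commands, so the list can be inspected before any GPU time is spent; call `run_all(dry_run=False)` to execute them one after another.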

If you have any questions, please open an issue on the project or email me or lbqin, and we will reply soon.

Acknowledgement

We are highly grateful for the public code of Stack-Propagation!

A Stack-Propagation Framework with Token-Level Intent Detection for Spoken Language Understanding. Libo Qin, Wanxiang Che, Yangming Li, Haoyang Wen and Ting Liu. EMNLP 2019 (long paper). [pdf] [code]

We are highly grateful for the open-source knowledge graph!
- CN-DBpedia
- OwnThink

Owner

Xiao Xu
