Paradigm Shift in NLP - "Paradigm Shift in Natural Language Processing".

Overview

Paradigm Shift in NLP

Welcome to the webpage for "Paradigm Shift in Natural Language Processing". Some resources of the paper are constantly maintained here, such as a full list of papers of paradigm shift, an interactive Sankey diagram to depict the trend of paradigm shift, etc.

What is paradigm shift?

First of all, what is paradigm, and what is paradigm shift?

Paradigm is the general framework to model a class of tasks. For example, sequence labeling (SeqLab) is a popular paradigm to solve named entity recognition (NER). We summarize the mainstream paradigms that are widely used for common NLP tasks as: Class, Matching, SeqLab, MRC, Seq2Seq, Seq2ASeq, (M)LM.

Paradigm shift is a phenomena of solving a task that is usually solved with some paradigm with another paradigm. For example, Li et al. (2020) uses the MRC paradigm to solve NER, which is previously solved with SeqLab, then we can say that the paradigm of NER shifted from SeqLab to MRC.

The figure below shows the observed shift (or transfer) of the seven paradigms in recent years.

Paradigm shift in NLP tasks

We collect the papers of paradigm shift in the table below, which is an extension of the Table 1 in our original paper. This table will be constantly updated.

Task Class Matching SeqLab MRC Seq2Seq Seq2ASeq (M)LM
TC Kim 2014;
Liu et al. 2016;
Devlin et al. 2019
Chai et al. 2020;
Yin et al. 2020;
Wang et al. 2021;
Yang et al. 2018 Brown et al. 2020;
Schick&Schutze 2021;
Schick&Schutze 2021;
Gao et al. 2021
NLI Devlin et al. 2019 Chen et al. 2017 McCann et al. 2018 Schick&Schutze 2021;
Schick&Schutze 2021;
Gao et al. 2021
NER Xia et al. 2019;
Fisher&Vlachos 2019;
Yu et al. 2020;
Fu et al. 2021
Ma&Hovy 2016;
Lample 2016
Li et al. 2020 Yan et al. 2021 Lample et al. 2016;
Dai et al. 2020
Ma et al. 2021
ABSA Wang et al. 2016 Sun et al. 2019 Mao et al. 2021
Chen et al. 2021
Yan et al. 2021;
Zhang et al. 2021
Li et al. 2021
RE Zeng et al. 2014 Levy et al. 2017;
Li et al. 2019;
Zhao et al. 2020
Han et al. 2021
Summ Zhong et al. 2020 Cheng&Lapata 2016 McCann et al. 2018 Aghajanyan et al. 2021
Parsing Rodríguez&Vilares 2018;
Strzyz et al. 2019;
Vilares&Rodríguez 2020;
Vacareanu et al. 2020;
Gan et al. 2021 Vinyals et al. 2015;
Li et al. 2018;
Rongali et al. 2020
Chen et al. 2014;
Dyer et al. 2015;
Choe&Charniak 2016

Trends

To intuitively depict the trend of paradigm shift in NLP, we also draw an interactive Sankey diagram, which is an extension of the Figure 2 in our original paper. Also, this diagram is constantly updated as the table above changed.

Contributing

This line of research is difficult to be comprehensively surveyed, so welcome any additions, modifications, and suggestions! Please feel free to submit pull request or directly contact me.

Citation

If you find this webpage or the paper helpful to your research, please cite our paper:

@article{sun2021paradigmshift,
  title={Paradigm Shift in Natural Language Processing}, 
  author={Tianxiang Sun and Xiangyang Liu and Xipeng Qiu and Xuanjing Huang},
  journal={arXiv preprint arXiv:2109.12575},
  year={2021}
}
Owner
Tianxiang Sun
@FudanNLP
Tianxiang Sun
Generating Korean Slogans with phonetic and structural repetition

LexPOS_ko Generating Korean Slogans with phonetic and structural repetition Generating Slogans with Linguistic Features LexPOS is a sequence-to-sequen

Yeoun Yi 3 May 23, 2022
Mirco Ravanelli 2.3k Dec 27, 2022
PortaSpeech - PyTorch Implementation

PortaSpeech - PyTorch Implementation PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech. Model Size Module Nor

Keon Lee 276 Dec 26, 2022
Samantha, A covid-19 information bot which will provide basic information about this pandemic in form of conversation.

Covid-19-BOT Samantha, A covid-19 information bot which will provide basic information about this pandemic in form of conversation. This bot uses torc

Neeraj Majhi 2 Nov 05, 2021
The (extremely) naive sentiment classification function based on NBSVM trained on wisesight_sentiment

thai_sentiment The naive sentiment classification function based on NBSVM trained on wisesight_sentiment วิธีติดตั้ง pip install thai_sentiment==0.1.3

Charin 7 Dec 08, 2022
⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x using fastT5.

Reduce T5 model size by 3X and increase the inference speed up to 5X. Install Usage Details Functionalities Benchmarks Onnx model Quantized onnx model

Kiran R 399 Jan 05, 2023
Reproducing the Linear Multihead Attention introduced in Linformer paper (Linformer: Self-Attention with Linear Complexity)

Linear Multihead Attention (Linformer) PyTorch Implementation of reproducing the Linear Multihead Attention introduced in Linformer paper (Linformer:

Kui Xu 58 Dec 23, 2022
🌸 fastText + Bloom embeddings for compact, full-coverage vectors with spaCy

floret: fastText + Bloom embeddings for compact, full-coverage vectors with spaCy floret is an extended version of fastText that can produce word repr

Explosion 222 Dec 16, 2022
Adversarial Examples for Extreme Multilabel Text Classification

Adversarial Examples for Extreme Multilabel Text Classification The code is adapted from the source codes of BERT-ATTACK [1], APLC_XLNet [2], and Atte

1 May 14, 2022
Voilà turns Jupyter notebooks into standalone web applications

Rendering of live Jupyter notebooks with interactive widgets. Introduction Voilà turns Jupyter notebooks into standalone web applications. Unlike the

Voilà Dashboards 4.5k Jan 03, 2023
Official code of our work, Unified Pre-training for Program Understanding and Generation [NAACL 2021].

PLBART Code pre-release of our work, Unified Pre-training for Program Understanding and Generation accepted at NAACL 2021. Note. A detailed documentat

Wasi Ahmad 138 Dec 30, 2022
spaCy plugin for Transformers , Udify, ELmo, etc.

Camphr - spaCy plugin for Transformers, Udify, Elmo, etc. Camphr is a Natural Language Processing library that helps in seamless integration for a wid

342 Nov 21, 2022
Traditional Chinese Text Recognition Dataset: Synthetic Dataset and Labeled Data

Traditional Chinese Text Recognition Dataset: Synthetic Dataset and Labeled Data Authors: Yi-Chang Chen, Yu-Chuan Chang, Yen-Cheng Chang and Yi-Ren Ye

Yi-Chang Chen 5 Dec 15, 2022
⚡ Automatically decrypt encryptions without knowing the key or cipher, decode encodings, and crack hashes ⚡

Translations 🇩🇪 DE 🇫🇷 FR 🇭🇺 HU 🇮🇩 ID 🇮🇹 IT 🇳🇱 NL 🇧🇷 PT-BR 🇷🇺 RU 🇨🇳 ZH ➡️ Documentation | Discord | Installation Guide ⬅️ Fully autom

11.2k Jan 05, 2023
Python utility library for compositing PDF documents with reportlab.

pdfdoc-py Python utility library for compositing PDF documents with reportlab. Installation The pdfdoc-py package can be installed directly from the s

Michael Gale 1 Jan 06, 2022
Mysticbbs-rjam - rJAM splitscreen message reader for MysticBBS A46+

rJAM splitscreen message reader for MysticBBS A46+

Robbert Langezaal 4 Nov 22, 2022
An open-source NLP library: fast text cleaning and preprocessing.

An open-source NLP library: fast text cleaning and preprocessing

Iaroslav 21 Mar 18, 2022
Generate vector graphics from a textual caption

VectorAscent: Generate vector graphics from a textual description Example "a painting of an evergreen tree" python text_to_painting.py --prompt "a pai

Ajay Jain 97 Dec 15, 2022
NLP - Machine learning

Flipkart-product-reviews NLP - Machine learning About Product reviews is an essential part of an online store like Flipkart’s branding and marketing.

Harshith VH 1 Oct 29, 2021