A Structured Self-attentive Sentence Embedding

Last update: Nov 28, 2022

Overview

Structured Self-attentive sentence embeddings

Implementation of the paper A Structured Self-Attentive Sentence Embedding, published at ICLR 2017: https://arxiv.org/abs/1703.03130

USAGE:

For binary sentiment classification on the IMDB dataset, run: python classification.py "binary"

For multiclass classification on the Reuters dataset, run: python classification.py "multiclass"

You can change the model parameters in the model_params.json file. Other training parameters, such as the number of attention hops, can be configured in the config.json file.

To use pretrained GloVe embeddings, set the use_embeddings parameter to "True" (the default is False). Do not forget to download glove.6B.50d.txt and place it in the glove folder.
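For orientation, here is a minimal sketch of how a GloVe text file such as glove.6B.50d.txt could be read. Each line of that file holds a word followed by its vector components, separated by spaces. The function name load_glove and the example path are hypothetical, and this is not necessarily how classification.py loads the embeddings.

```python
def load_glove(path, dim=50):
    """Read a GloVe text file into a {word: [float, ...]} dictionary.

    Hypothetical helper for illustration; not taken from this repository.
    """
    vectors = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            parts = line.rstrip("\n").split(" ")
            # A well-formed line has the word plus `dim` numbers.
            if len(parts) != dim + 1:
                continue
            vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors
```

A call such as load_glove("glove/glove.6B.50d.txt") would then return one 50-dimensional vector per vocabulary word, assuming the file sits at that path.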

Implemented:

Classification using self attention
Regularization using Frobenius norm
Gradient clipping
Visualizing the attention weights
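The first two items can be illustrated with a short NumPy sketch of the formulas from the paper: the attention matrix A = softmax(W_s2 tanh(W_s1 H^T)), the sentence embedding matrix M = AH, and the penalty ||AA^T - I||_F^2. The function names and argument shapes below are assumptions made for illustration; the repository's own code may be organized differently.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(H, W_s1, W_s2):
    """Structured self-attention for one sentence.

    H:    (n, 2u) hidden states for n tokens
    W_s1: (d_a, 2u)
    W_s2: (r, d_a), where r is the number of attention hops
    Returns A with shape (r, n) and M with shape (r, 2u).
    """
    scores = W_s2 @ np.tanh(W_s1 @ H.T)  # (r, n)
    A = softmax(scores, axis=1)          # each hop is a distribution over tokens
    M = A @ H                            # one weighted sum of hidden states per hop
    return A, M

def frobenius_penalty(A):
    """Squared Frobenius norm of (A A^T - I), which discourages redundant hops."""
    r = A.shape[0]
    diff = A @ A.T - np.eye(r)
    return float(np.sum(diff ** 2))
```

The penalty is zero when every hop puts all of its weight on a different token, and it grows as hops overlap.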

Instead of pruning, averaging over the sentence embeddings is used.
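One plausible reading of this, sketched below, is that the r hop embeddings in M are averaged into a single vector before the classifier. That interpretation, the function name average_hops, and the (batch, r, 2u) layout are assumptions rather than something taken from the repository.

```python
import numpy as np

def average_hops(M):
    """Average the r hop embeddings of each sentence.

    M: (batch, r, 2u) sentence embedding matrices
    Returns an array of shape (batch, 2u).
    """
    return M.mean(axis=1)
```

Averaging keeps the classifier input at size 2u regardless of the number of hops, whereas flattening M would give an input of size r * 2u.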

Visualization:

After training, the model is tested on 100 test points. The attention weights for these 100 test points are retrieved and visualized over the text as heatmaps. A file visualization.html is saved in the visualization/ folder after successful training. The visualization code was provided by Zhouhan Lin (@hantek). Many thanks.
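To give a rough idea of what a text heatmap involves, the sketch below wraps each token in an HTML span whose background opacity follows its attention weight. It is an independent illustration and not the visualization code provided by Zhouhan Lin; the function name attention_to_html and the red color scheme are assumptions.

```python
import html

def attention_to_html(tokens, weights):
    """Render tokens as HTML spans shaded by attention weight.

    Illustrative only; weights are rescaled so the largest maps to full opacity.
    """
    if len(tokens) != len(weights):
        raise ValueError("tokens and weights must have the same length")
    if not tokens:
        return ""
    peak = max(weights) or 1.0  # avoid dividing by zero when all weights are 0
    spans = []
    for token, weight in zip(tokens, weights):
        alpha = weight / peak
        spans.append(
            f'<span style="background-color: rgba(255, 0, 0, {alpha:.2f})">'
            f"{html.escape(token)}</span>"
        )
    return " ".join(spans)
```

Writing the returned string into an HTML file, one line per test sentence, would give a page similar in spirit to visualization.html.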

Below is a screenshot of the visualization on a few data points.

Training accuracy: 93.4%. Tested on 1000 points with 90.2% accuracy.

Owner

Kaushal Shetty
