This project deals with a simplified version of a more general problem of Aspect Based Sentiment Analysis.

Last update: Jan 01, 2023

Overview

Aspect_Based_Sentiment_Extraction

Created on: 5th Jan, 2022.

This project deals with an important field of Natural Language Processing: Aspect-Based Sentiment Analysis (ABSA). The problem statement here, however, is a simplified version of the more general ABSA problem.
Aspect-based sentiment analysis is a type of text analysis that categorizes opinions by aspect and identifies the sentiment related to each aspect. Aspects are words or phrases that matter to a business or organization, which wants insight into how its customers feel about them.
The general ABSA problem, an active area of machine learning research, is about finding all the aspects in a given text or document together with the sentiment associated with each. For example, given a sentence like “I like apples very much, but I hate kiwi”, an ideal ABSA system should identify the aspects apples and kiwi with the sentiments positive and negative, respectively.
In the problem statement that this project deals with, however, the aspect word or phrase is already given along with the text. The problem is therefore simpler: there is no need to handle the complex task of identifying the aspects in the text. In the future, I will work on the more general version of this problem, where the aspects also need to be identified.
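
The difference between the two problem settings can be sketched with plain data structures. This is only an illustration: the AspectExample class, its field names, and the label strings are assumptions made for the example, not the format used by this repository's dataset or code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AspectExample:
    """One instance of the simplified task (hypothetical structure)."""
    text: str    # the full sentence or document
    aspect: str  # the aspect word/phrase, already given
    label: str   # the sentiment to predict for that aspect


sentence = "I like apples very much, but I hate kiwi"

# General ABSA: the system must discover both the aspects and their sentiments.
general_output = {"apples": "positive", "kiwi": "negative"}

# Simplified task: each (text, aspect) pair is supplied as input and only the
# sentiment has to be predicted, so one sentence yields one example per aspect.
examples = [
    AspectExample(text=sentence, aspect=aspect, label=label)
    for aspect, label in general_output.items()
]

for example in examples:
    print(example.aspect, "->", example.label)
```

The same sentence therefore appears twice in the simplified setting, once per aspect, each time with a different target label.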

A brief description of approach

This project explores the use of a pre-trained language model, BERT (Bidirectional Encoder Representations from Transformers), to solve the problem described above. BERT offers robust contextual embeddings that are useful for a wide variety of problems. The idea here is to explore the modelling capabilities of the BERT embeddings by using the sentence-pair input format for the aspect sentiment prediction task. The resulting model achieved 99.40% accuracy on the training data and 96.16% accuracy on the test data.
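
To make the sentence-pair idea concrete, here is a minimal, library-free sketch of how a BERT-style pair input is laid out. It rests on several assumptions: whitespace splitting stands in for BERT's WordPiece tokenizer, build_pair_input is a hypothetical helper, and placing the text in the first segment and the aspect in the second is one plausible arrangement rather than necessarily the one used in this repository.

```python
def build_pair_input(text, aspect):
    """Lay out a (text, aspect) pair as [CLS] text [SEP] aspect [SEP].

    Whitespace splitting is a stand-in for WordPiece tokenization,
    used only to keep the sketch self-contained.
    """
    text_tokens = text.lower().split()
    aspect_tokens = aspect.lower().split()

    tokens = ["[CLS]"] + text_tokens + ["[SEP]"] + aspect_tokens + ["[SEP]"]

    # Segment ids: 0 covers [CLS], the text and the first [SEP];
    # 1 covers the aspect and the final [SEP].
    segment_ids = [0] * (len(text_tokens) + 2) + [1] * (len(aspect_tokens) + 1)
    return tokens, segment_ids


tokens, segment_ids = build_pair_input(
    "I like apples very much, but I hate kiwi", "kiwi"
)
for token, segment in zip(tokens, segment_ids):
    print(segment, token)
```

In a real pipeline, a BERT tokenizer would produce this layout (plus input ids and an attention mask) when given the two strings as a pair, and a classification head on top of the encoder would predict the sentiment label.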

Instructions to run and test files

Clone this repository and navigate to the project folder:
git clone https://github.com/stardust-88/Aspect_Based_Sentiment_Extraction.git
cd Aspect_Based_Sentiment_Extraction

To install the dependencies:
pip3 install -r requirements.txt

To train:
Navigate to the src folder and run the following command:
python train.py

For inference:
Navigate to the src folder and run the following command:
python inference.py

Instructions for using trained model weights

I have saved my trained weights to Google Drive and generated a link from which they can be downloaded, using the steps below.

Navigate to the models directory.
Inside the models directory, run the file download_model.py: python download_model.py

To run inference with the pre-trained weights, first download them by following the two steps above, then run the inference.py script.

Results from the model

Accuracy curve:

Loss curve:

Classification report:

Confusion matrix:

Owner

Naman Rastogi