This project deals with a simplified version of a more general problem of Aspect Based Sentiment Analysis.

Overview

Aspect_Based_Sentiment_Extraction

Created on: 5th Jan, 2022.

This project deals with an important field of Natural Lnaguage Processing - Aspect Based Sentiment Analysis (ABSA). But the problem statement here is rather a simplified version of the more general ABSA.
Aspect-Based Sentiment analysis is a type of text analysis that categorizes opinions by aspect and identifies the sentiment related to each aspect. Aspects are important words that are of importance to a business or organization, where they want to be able to provide their customers with insights on how their customers feel about these important words.
The general ABSA problem, which is an active area of machine learning research, is about finding all the possible aspects and the corresponding sentiments associated with those aspects in a given text or a document. For example, given a sentence like “I like apples very much, but I hate kiwi”, an ideal absa system should be able to identify aspects like apples and kiwi with correct sentiments of positive and negative respectively.
But here, in the problem statement that this project deals with, an aspect word/phrase is already given from the given text, which means that our problem is rather simplified and we don’t need to worry about the complex task of identifying aspects as well in the text, at least for this problem statement that I am dealing with. In future, I will be working with the more general version of this problem, where aspects are also needed to be indentified.


A brief description of approach

This article explores the use of a pre-trained language model, BERT (Bidirectional Encoder Representaton from Transformers), for the purpose of solving the aforementioned problem. BERT offers very robust contextual embeddings which are useful to solve the variety of problems. Therefore, the sole idea here is to explore the modelling capabilities of the BERT embeddings, by making use of the sentence pair input for the aspect sentiment prediction task. The model which I came up with was able to achieve 99.40% accuracy on the training data and 96.16% accuracy on the test data.

Instructions to run and test files

Clone this repository and navigate to the project folder:
git clone https://github.com/stardust-88/Aspect_Based_Sentiment_Extraction.git
cd Aspect_Based_sentiment_Extraction

To install the dependencies:
pip3 install -r requirements.txt

To train:
Navigate to the src folder and run the below command:
python train.py

For inference:
Navigate to the src folder and run the below command:
python inference.py

Instructions for using trained model weights

I have saved my trained weights to google drive and generated the link, which can be used to download the same. This can be done through below steps.

  1. Navigate to the the models directory.
  2. When inside the models directory, run the file download_model.py: python download_model.py

So, if the user wants to do the inference using pre-trained weights, first download the weights following above two steps, then then run the inference.py script.

Results from the model

  1. Accuracy curve:

  1. Loss curve:

  1. Classification report:

  1. Confusion matrix:

Owner
Naman Rastogi
An undergraduate in Computer Science and Engineering. Trying to discover fundamental patterns with machine learning.
Naman Rastogi
Mastering Transformers, published by Packt

Mastering Transformers This is the code repository for Mastering Transformers, published by Packt. Build state-of-the-art models from scratch with adv

Packt 195 Jan 01, 2023
This is a really simple text-to-speech app made with python and tkinter.

Tkinter Text-to-Speech App by Souvik Roy This is a really simple tkinter app which converts the text you have entered into a speech. It is created wit

Souvik Roy 1 Dec 21, 2021
NLP, before and after spaCy

textacy: NLP, before and after spaCy textacy is a Python library for performing a variety of natural language processing (NLP) tasks, built on the hig

Chartbeat Labs Projects 2k Jan 04, 2023
Code for ACL 2020 paper "Rigid Formats Controlled Text Generation"

SongNet SongNet: SongCi + Song (Lyrics) + Sonnet + etc. @inproceedings{li-etal-2020-rigid, title = "Rigid Formats Controlled Text Generation",

Piji Li 212 Dec 17, 2022
OceanScript is an Esoteric language used to encode and decode text into a formulation of characters

OceanScript is an Esoteric language used to encode and decode text into a formulation of characters - where the final result looks like waves in the ocean.

Repository of the Code to Chatbots, developed in Python

Description In this repository you will find the Code to my Chatbots, developed in Python. I'll explain the structure of this Repository later. Requir

Li-am K. 0 Oct 25, 2022
Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application with a focus on embedded systems.

Welcome to Spokestack Python! This library is intended for developing voice interfaces in Python. This can include anything from Raspberry Pi applicat

Spokestack 133 Sep 20, 2022
Officile code repository for "A Game-Theoretic Perspective on Risk-Sensitive Reinforcement Learning"

CvarAdversarialRL Official code repository for "A Game-Theoretic Perspective on Risk-Sensitive Reinforcement Learning". Initial setup Create a virtual

Mathieu Godbout 1 Nov 19, 2021
A Multilingual Latent Dirichlet Allocation (LDA) Pipeline with Stop Words Removal, n-gram features, and Inverse Stemming, in Python.

Multilingual Latent Dirichlet Allocation (LDA) Pipeline This project is for text clustering using the Latent Dirichlet Allocation (LDA) algorithm. It

Artifici Online Services inc. 74 Oct 07, 2022
Implementation of TTS with combination of Tacotron2 and HiFi-GAN

Tacotron2-HiFiGAN-master Implementation of TTS with combination of Tacotron2 and HiFi-GAN for Mandarin TTS. Inference In order to inference, we need t

SunLu Z 7 Nov 11, 2022
GrammarTagger — A Neural Multilingual Grammar Profiler for Language Learning

GrammarTagger — A Neural Multilingual Grammar Profiler for Language Learning GrammarTagger is an open-source toolkit for grammatical profiling for lan

Octanove Labs 27 Jan 05, 2023
A Domain Specific Language (DSL) for building language patterns. These can be later compiled into spaCy patterns, pure regex, or any other format

RITA DSL This is a language, loosely based on language Apache UIMA RUTA, focused on writing manual language rules, which compiles into either spaCy co

Šarūnas Navickas 60 Sep 26, 2022
Repository for Project Insight: NLP as a Service

Project Insight NLP as a Service Contents Introduction Features Installation Setup and Documentation Project Details Demonstration Directory Details H

Abhishek Kumar Mishra 286 Dec 06, 2022
SHAS: Approaching optimal Segmentation for End-to-End Speech Translation

SHAS: Approaching optimal Segmentation for End-to-End Speech Translation In this repo you can find the code of the Supervised Hybrid Audio Segmentatio

Machine Translation @ UPC 21 Dec 20, 2022
Neural-Machine-Translation - Implementation of revolutionary machine translation models

Neural Machine Translation Framework: PyTorch Repository contaning my implementa

Utkarsh Jain 1 Feb 17, 2022
Code release for NeX: Real-time View Synthesis with Neural Basis Expansion

NeX: Real-time View Synthesis with Neural Basis Expansion Project Page | Video | Paper | COLAB | Shiny Dataset We present NeX, a new approach to novel

537 Jan 05, 2023
NLP Text Classification

多标签文本分类任务 近年来随着深度学习的发展,模型参数的数量飞速增长。为了训练这些参数,需要更大的数据集来避免过拟合。然而,对于大部分NLP任务来说,构建大规模的标注数据集非常困难(成本过高),特别是对于句法和语义相关的任务。相比之下,大规模的未标注语料库的构建则相对容易。为了利用这些数据,我们可以

Jason 1 Nov 11, 2021
Journalism AI – Quotes extraction for modular journalism

Quote extraction for modular journalism (JournalismAI collab 2021)

Journalism AI collab 2021 207 Dec 25, 2022
Finally, some decent sample sentences

tts-dataset-prompts This repository aims to be a decent set of sentences for people looking to clone their own voices (e.g. using Tacotron 2). Each se

hecko 19 Dec 13, 2022
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

English | 简体中文 | 繁體中文 | 한국어 State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow 🤗 Transformers provides thousands of pretrained models

Hugging Face 77.1k Dec 31, 2022