In this project, we compare Spanish BERT and Multilingual BERT on the sentiment analysis task.

Last update: Jan 03, 2022

Overview

Applying BERT Fine Tuning to Sentiment Classification on Amazon Reviews

Abstract

Sentiment analysis has made great progress in recent years, driven in part by companies that want to better understand how consumers rate their products. However, despite the advances in artificial intelligence applied to this task, the most robust models are available for English. In this work, we compare a monolingual and a multilingual model, Spanish BERT (BETO) and Multilingual BERT, both based on BERT's Transformer architecture. Each model was fine-tuned for sentiment analysis on the Spanish Amazon reviews dataset and evaluated with the accuracy and F1 score metrics. Spanish BERT achieved the best results on this task.

This paper is available here.
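
The abstract reports accuracy and F1 score. As a rough illustration of what those two metrics measure, here is a minimal pure-Python sketch. It is not the evaluation code used in the paper: the function names and toy labels are made up for this example, and macro-averaging the F1 score over classes is an assumption, since the averaging scheme is not stated here.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_for_label(y_true, y_pred, label):
    """F1 score for one class: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, l) for l in labels) / len(labels)


# Toy labels, invented for illustration only.
y_true = ["pos", "pos", "neg", "neg"]
y_pred = ["pos", "neg", "neg", "neg"]
print(accuracy(y_true, y_pred))  # 0.75
print(round(macro_f1(y_true, y_pred), 4))  # 0.7333
```

In practice a library such as scikit-learn would normally be used for these metrics; the hand-written version only makes the definitions explicit.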

Pipeline

Prerequisites

Linux / Windows
Python3

Clone this Repository

git clone https://github.com/alexliqu09/Sentiment-Analysis-on-Amazon-Reviews.git

Train model

To train the models, use the Colab notebooks:

Beto
MBert

Run the project locally

To try out the project, run the following commands:

First, install the requirements file:

pip install -r requeriments.txt

Second, download the weights of BETO and MBERT and put them in this directory.

Third, start the Streamlit server:

streamlit run main.py

Note: once the server starts, the app is served at:

Local URL: http://localhost:8501
Network URL: http://192.168.0.5:8501 (this address depends on your network)
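
If the page does not load, it can help to confirm that something is actually listening on the Streamlit port. The sketch below is a generic TCP check using only the Python standard library. The helper name `is_port_open` is hypothetical and not part of this repository, and port 8501 is taken from the URLs above.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hosts.
        return False


if __name__ == "__main__":
    # 8501 is the port shown in the local URL above.
    print(is_port_open("localhost", 8501))
```

A result of `True` only means the port accepts connections; it does not verify that the models loaded correctly.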

Run with Docker 🐋

# Build the Docker image

docker build -t bert .

# Run the container
docker run -t -p 5000:5000 --name betocontainer bert

Then open http://172.17.0.2:8501 in your browser.

If you find our work useful, please cite this paper:

@inproceedings{lvrBERT,
  title={Applying BERT Fine Tuning to Sentiment Classification on Amazon Reviews},
  author={Lique, Alexander and Vásquez, Diego and Rios, Manuel},
  year={2021}
}


Owner

Alexander Leonardo Lique Lamas
