I label phrases on a scale of five values: negative, somewhat negative, neutral, somewhat positive, positive

Last update: Jan 13, 2022

Overview

Sentiment-of-movie-reviews

I label phrases on a scale of five values: negative, somewhat negative, neutral, somewhat positive, positive. Obstacles like sentence negation, sarcasm, terseness, language ambiguity, and many others make this task very challenging.

This project uses datasets available on Kaggle for training and testing.

Transformers brings all these models together and makes each one easy to use with only a few lines of code. It even provides tools such as pipelines and live demos that let us classify text without any training or lengthy coding. But as you can guess, these simple, ready-to-use models have their weaknesses. For example, you can't classify text with the number of labels you want, because the models have been pretrained on text with a specific set of labels. Also, not all of the default models are as strong and accurate as we would like (for example, the default model for sentiment analysis is an uncased DistilBERT, which is not the best model available). With all this in mind, we want to train Transformers models on our own data, using the models we prefer.
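As a rough illustration of the point about label counts, the sketch below sets up the five sentiment classes and shows how they could be handed to a Transformers sequence-classification model. The integer order of the labels, the helper name `load_five_class_model`, and the `distilbert-base-uncased` checkpoint are assumptions made for the example, not details taken from this project.

```python
# Five sentiment classes from the task description.
# Assumption: ids run from 0 (negative) to 4 (positive).
LABELS = [
    "negative",
    "somewhat negative",
    "neutral",
    "somewhat positive",
    "positive",
]

# Mappings between integer class ids and readable label names.
ID2LABEL = dict(enumerate(LABELS))
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def load_five_class_model(checkpoint="distilbert-base-uncased"):
    """Hypothetical helper: load a checkpoint with a 5-way classification head.

    The checkpoint name is only a placeholder; any sequence-classification
    capable checkpoint you prefer could be substituted.
    """
    # Imported lazily so the label mappings can be used without transformers installed.
    from transformers import AutoModelForSequenceClassification

    return AutoModelForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=len(LABELS),
        id2label=ID2LABEL,
        label2id=LABEL2ID,
    )
```

Calling `load_five_class_model()` downloads the checkpoint and attaches a freshly initialized five-way classification head, which then has to be fine-tuned on the labeled phrases before its predictions mean anything.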

