Multilingual emotion classification using BERT (fine-tuning). Published at the WASSA workshop (ACL 2022).

Last update: Sep 17, 2022

Overview

XLM-EMO: Multilingual Emotion Prediction in Social Media Text

Abstract

Detecting emotion in text allows social and computational scientists to study how people behave and react to online events. However, developing these tools for different languages requires data that is not always available. This paper collects the available emotion detection datasets across 19 languages. We train a multilingual emotion prediction model for social media data, XLM-EMO. The model shows competitive performance in a zero-shot setting, suggesting it is helpful in the context of low-resource languages. We release our model to the community so that interested researchers can directly use it.

See the paper for additional details:

Bianchi, F., Nozza, D., & Hovy, D. "XLM-EMO: Multilingual Emotion Prediction in Social Media Text". In Proceedings of the 12th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (Forthcoming). Association for Computational Linguistics, 2022. Link.

Free software: MIT license

Installing

pip install -U xlm-emo

Important: To use CUDA, install the CUDA version that matches your system; see the PyTorch installation instructions.
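To check whether the installed PyTorch build can actually see a GPU, a small sketch like the following may help. The helper name `pick_device` is hypothetical and not part of xlm-emo; it only reports what PyTorch detects and falls back to "cpu" when PyTorch is not installed.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" if PyTorch is installed and detects a GPU, else "cpu".

    Hypothetical helper for illustration; not part of the xlm-emo package.
    """
    # Avoid an ImportError when PyTorch is not installed at all.
    if importlib.util.find_spec("torch") is None:
        return "cpu"

    import torch

    # torch.cuda.is_available() is False when the CUDA build or driver
    # does not match the system, which is the situation described above.
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```

If this prints "cpu" on a machine with a GPU, the installed PyTorch build probably does not match the local CUDA setup.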

Features

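from xlm_emo.classifier import EmotionClassifier

# Create the classifier
ec = EmotionClassifier()

# predict takes a list of texts (here one Italian, one English)
# and returns one emotion label per text
ec.predict(["senti testa di cazzo", "I am very happy"])

>> ["anger", "joy"]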
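For larger collections it can be convenient to classify texts in chunks and summarize the predicted labels. The sketch below assumes only what the example above shows: an object whose `predict` method takes a list of strings and returns one label per string. The function name `emotion_counts` and the `batch_size` parameter are hypothetical and not part of the package.

```python
from collections import Counter
from typing import Iterable, List


def emotion_counts(classifier, texts: Iterable[str], batch_size: int = 32) -> Counter:
    """Count predicted emotion labels over a collection of texts.

    `classifier` is any object with predict(list_of_texts) -> list_of_labels,
    such as the EmotionClassifier instance created above (assumption).
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    items: List[str] = list(texts)
    labels: List[str] = []

    # Send the texts to the classifier in chunks of batch_size.
    for start in range(0, len(items), batch_size):
        labels.extend(classifier.predict(items[start:start + batch_size]))

    return Counter(labels)
```

With the classifier from the example above, `emotion_counts(ec, tweets)` would return a `Counter` mapping each predicted emotion to its frequency, where `tweets` is any iterable of strings.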

Models

Model	Link	Macro F1 on Test Set
XLM-EMO-T	https://huggingface.co/MilaNLProc/xlm-emo-t	0.85
XLM-EMO-B	TBD	TBD
XLM-EMO-L	TBD	TBD
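The table reports macro F1, which is the unweighted mean of the per-class F1 scores, so every emotion counts equally regardless of how often it occurs. The following is a minimal sketch of that metric in plain Python; it is an illustration of the standard definition, not the evaluation script used for the numbers above, and the function name `macro_f1` is hypothetical.

```python
from typing import Sequence


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 over all labels seen in y_true or y_pred."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)

    return sum(scores) / len(scores) if scores else 0.0
```

For example, with gold labels `["joy", "joy", "anger", "anger"]` and predictions `["joy", "anger", "anger", "anger"]`, the per-class F1 scores are 2/3 for joy and 4/5 for anger, giving a macro F1 of about 0.73.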

Reference

If you use this tool please cite the following paper:

@inproceedings{bianchi-etal-2022-xlmemo,
title = {{XLM-EMO}: Multilingual Emotion Prediction in Social Media Text},
author = "Bianchi, Federico and Nozza, Debora and Hovy, Dirk",
booktitle = "Proceedings of the 12th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis",
year = "2022",
publisher = "Association for Computational Linguistics"
}

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
