How Effective is Incongruity? Implications for Code-mix Sarcasm Detection.

Last update: Jun 05, 2022

Overview

This repo contains the code for the following paper:

How Effective is Incongruity? Implications for Code-mix Sarcasm Detection.
Aditya Shah, Chandresh Kumar Maurya, In Proceedings of the 18th International Conference on Natural Language Processing - (ACL 2021).

The presentation slides are available here

Requirements

Python 3.6 or higher
PyTorch >= 1.3.0
pytorch_transformers (also known as transformers)
pandas, NumPy, pickle (pickle ships with the Python standard library)
Fasttext

Download the fasttext embed file:

The fasttext embedding file can be obtained here

Dataset

We release a benchmark sarcasm dataset for the Hinglish language (code-mixed Hindi-English) to facilitate further research on code-mix NLP.

We created the dataset using TweetScraper, which is built on top of Scrapy, to extract code-mix Hindi-English tweets. The queries combine search tags such as #sarcasm, #humor, #bollywood, and #cricket with the most commonly used code-mix Hindi words. All tweets with hashtags such as #sarcasm, #sarcastic, #irony, or #humor are treated as positive (sarcastic). Non-sarcastic tweets are extracted using general hashtags such as #politics, #food, and #movie. The balanced dataset comprises 166K tweets.
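The hashtag-based labeling rule described above can be sketched as follows. This is a minimal illustration, not the authors' scraping code: the tag set contains only the examples named in the text (the full list is not given), and `SARCASM_TAGS` and `label_tweet` are hypothetical names.

```python
import re

# Assumption: only the example hashtags named in the text; the real list may be longer.
SARCASM_TAGS = {"#sarcasm", "#sarcastic", "#irony", "#humor"}

# Matches a '#' followed by word characters (Unicode-aware in Python 3).
HASHTAG_RE = re.compile(r"#\w+")


def label_tweet(text):
    """Return 1 (sarcastic) if the tweet carries a sarcasm-related hashtag, else 0."""
    tags = {tag.lower() for tag in HASHTAG_RE.findall(text)}
    return 1 if tags & SARCASM_TAGS else 0


if __name__ == "__main__":
    print(label_tweet("Wah kya baat hai #Sarcasm"))    # sarcasm hashtag present
    print(label_tweet("Aaj ka khana mast tha #food"))  # general hashtag only
```

Labels have to be assigned before the cleaning step, because cleaning removes the hashtags that the rule depends on.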

Finally, we preprocess and clean the data by removing URLs, hashtags, mentions, and punctuation. The respective files can be found here as train.csv, val.csv, and test.csv.
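A rough sketch of the cleaning step, assuming that hashtags and mentions are dropped whole (symbol and word) and that punctuation means ASCII punctuation. The actual preprocessing script is not shown here, so the regular expressions and the name `clean_tweet` are illustrative assumptions.

```python
import re
import string

# Assumed patterns; the repo's own cleaning rules may differ.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
TAG_OR_MENTION_RE = re.compile(r"[#@]\w+")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_tweet(text):
    """Remove URLs, hashtags, mentions, and ASCII punctuation, then squeeze whitespace."""
    text = URL_RE.sub(" ", text)             # URLs first, before punctuation is stripped
    text = TAG_OR_MENTION_RE.sub(" ", text)  # whole hashtags and @mentions
    text = text.translate(PUNCT_TABLE)       # remaining ASCII punctuation
    return " ".join(text.split())


if __name__ == "__main__":
    print(clean_tweet("@user wah kya baat hai! #sarcasm https://t.co/abc"))
```

URLs are removed before punctuation so that the URL pattern still sees the `://` and dots it matches on.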

Arguments:

--epochs: total number of epochs to run, default=10

--batch-size: training batch size, default=2

--lr: learning rate for the model, default=5.16e-05

--hidden_size_lstm: hidden size of the LSTM, default=1024

--hidden_size_linear: hidden size of the linear layer, default=128

--seq_len: sequence length of the input text, default=56

--clip: gradient clipping value, default=0.218

--dropout: dropout value, default=0.198

--num_layers: number of LSTM layers, default=1

--lstm_bidirectional: whether the LSTM is bidirectional, default=False

--fasttext_embed_file: path to the fasttext embedding file, default='new_hing_emb'

--train_dir: path to the train file, default='train.csv'

--valid_dir: path to the validation file, default='valid.csv'

--test_dir: path to the test file, default='test.csv'

--checkpoint_dir: path to the saved model checkpoint, default='selfnet.pt'

--test: run the model in test mode, default=False
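For orientation, the flags above could be declared with argparse roughly as follows. This is a sketch that mirrors the documented names and defaults, not the parser in main.py. The helpers `str2bool` and `build_parser` are hypothetical; `str2bool` is included because argparse's `type=bool` would treat any non-empty string, including "False", as true.

```python
import argparse


def str2bool(value):
    """Parse 'True'/'False' style strings into booleans (hypothetical helper)."""
    if isinstance(value, bool):
        return value
    lowered = value.lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    """Build a parser mirroring the documented flags and defaults (a sketch)."""
    p = argparse.ArgumentParser(description="Code-mix sarcasm detection")
    p.add_argument("--epochs", type=int, default=10)
    p.add_argument("--batch-size", type=int, default=2)  # stored as args.batch_size
    p.add_argument("--lr", type=float, default=5.16e-05)
    p.add_argument("--hidden_size_lstm", type=int, default=1024)
    p.add_argument("--hidden_size_linear", type=int, default=128)
    p.add_argument("--seq_len", type=int, default=56)
    p.add_argument("--clip", type=float, default=0.218)
    p.add_argument("--dropout", type=float, default=0.198)
    p.add_argument("--num_layers", type=int, default=1)
    p.add_argument("--lstm_bidirectional", type=str2bool, default=False)
    p.add_argument("--fasttext_embed_file", type=str, default="new_hing_emb")
    p.add_argument("--train_dir", type=str, default="train.csv")
    p.add_argument("--valid_dir", type=str, default="valid.csv")
    p.add_argument("--test_dir", type=str, default="test.csv")
    p.add_argument("--checkpoint_dir", type=str, default="selfnet.pt")
    p.add_argument("--test", type=str2bool, default=False)
    return p


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```

With a parser like this, `--test True` switches to test mode, while omitting the flag keeps the default training run.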

Train

python main.py

Test

python main.py --test True
