Code for the paper "Combining Textual Features for the Detection of Hateful and Offensive Language"

Last update: Aug 04, 2022

Overview

The repository provides the source code for the paper "Combining Textual Features for the Detection of Hateful and Offensive Language" submitted to HASOC 2021 English Subtask 1A.

Publication

Installation (requires Python >= 3.6)

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Download the 'resources.zip' file from: https://drive.google.com/file/d/1X88cMrLVpAcJd5Z4Gg6MfTLclIuGF-d6/view?usp=sharing
Extract the contents of 'resources.zip'.
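
The extraction step can also be done from Python. The following is a minimal sketch, assuming 'resources.zip' has already been downloaded manually and that the archive unpacks to the 'resources' folder referenced in config.json (for example 'resources/hatebert'). The helper name extract_resources is hypothetical and not part of this repository.

```python
import zipfile
from pathlib import Path


def extract_resources(zip_path, target_dir="."):
    """Extract the archive into target_dir and return the member names.

    Hypothetical helper: the layout of 'resources.zip' is an assumption,
    so the returned names are the easiest way to see what was unpacked.
    """
    zip_path = Path(zip_path)
    if not zip_path.is_file():
        raise FileNotFoundError(f"{zip_path} not found; download it first")
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()


# Example (run from the folder that contains the downloaded archive):
# names = extract_resources("resources.zip")
# print(names[:5])
```

After extraction, it is worth checking that the path given in "bert_model_dir" exists relative to the folder you run main.py from.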

Training and Evaluation on HASOC datasets (2019, 2020, 2021)

Run the following command to train and evaluate the model. The evaluation results are saved in the 'results' folder.

python main.py -c config.json

Optimizing Hyperparameters

The "config.json" file contains hyperparameters that can be changed to train different variants of the model.

{
  "base_dir": "",
  "batch_size": 64,
  "epochs": 20,
  "epoch_patience": 5,
  "bert_model_dir": "resources/hatebert",
  "monitor": "loss",
  "tweet_text_seq_len": 80,
  "tweet_text_char_len": 128,
  "char_size": 29,
  "max_learning_rate": 0.001,
  "end_learning_rate": 0.0000001,
  "rnn_type": "lstm",
  "rnn_layer_size": 200,
  "text_models": ["char_emb", "bert", "hate_words"],
  "normalize_text": true,
  "dataset_year": "2021",
  "optimizer": "adam",
  "text_use_attention": false,
  "oversample": true,
  "feature_normalization_layer_size": 512,
  "min_feature_normalization_layer_size": 64
}
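
To train a variant without editing config.json by hand, a copy with a few overridden keys can be written to a separate file and passed to main.py with -c. The sketch below is an illustration only: make_variant and write_variant are hypothetical helpers, and it assumes main.py accepts any JSON file with the same keys as config.json.

```python
import copy
import json
from pathlib import Path


def make_variant(base_config, **overrides):
    """Return a copy of base_config with the given keys replaced.

    Unknown keys are rejected so that a typo such as 'rnn_typ' does not
    silently produce a config identical to the base one.
    """
    unknown = set(overrides) - set(base_config)
    if unknown:
        raise KeyError(f"unknown hyperparameter(s): {sorted(unknown)}")
    variant = copy.deepcopy(base_config)
    variant.update(overrides)
    return variant


def write_variant(config_path, out_path, **overrides):
    """Load config_path, apply overrides, and save the result to out_path."""
    base = json.loads(Path(config_path).read_text(encoding="utf-8"))
    variant = make_variant(base, **overrides)
    Path(out_path).write_text(json.dumps(variant, indent=2) + "\n", encoding="utf-8")
    return variant


# Example (file name 'config_gru_2020.json' is an arbitrary choice):
# write_variant("config.json", "config_gru_2020.json",
#               rnn_type="gru", dataset_year="2020")
# then: python main.py -c config_gru_2020.json
```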

bert_model_dir

"bert_model_dir": "resources/hatebert"
     OR
"bert_model_dir": "resources/bert-base"

dataset_year

"dataset_year": "2019"
	OR
"dataset_year": "2020"
	OR
"dataset_year": "2021"

text_models

"text_models": ["hate_words"]
	OR
"text_models": ["bert"]
	OR
"text_models": ["char_emb"]
	OR
"text_models": ["char_emb", "bert", "hate_words"]

rnn_type

"rnn_type": "lstm"
	OR
"rnn_type": "gru"
	OR
"rnn_type": "bi-gru"
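
The options listed above can be checked and combined programmatically. The following sketch only encodes the values documented in this README; the code itself may accept others, and combinations of "text_models" other than the four shown above are untested here. check_config and iter_variants are hypothetical helpers, not part of the repository.

```python
import itertools

# Values taken from the option lists in this README (not from the source code).
DOCUMENTED_OPTIONS = {
    "bert_model_dir": ["resources/hatebert", "resources/bert-base"],
    "dataset_year": ["2019", "2020", "2021"],
    "rnn_type": ["lstm", "gru", "bi-gru"],
}
DOCUMENTED_TEXT_MODELS = {"char_emb", "bert", "hate_words"}


def check_config(config):
    """Return a list of problems; an empty list means nothing undocumented was found."""
    problems = []
    for key, allowed in DOCUMENTED_OPTIONS.items():
        if key in config and config[key] not in allowed:
            problems.append(f"{key}={config[key]!r} is not one of {allowed}")
    extra = set(config.get("text_models", [])) - DOCUMENTED_TEXT_MODELS
    if extra:
        problems.append(f"undocumented text_models entries: {sorted(extra)}")
    return problems


def iter_variants(base_config, grid):
    """Yield one config per combination of the values in grid (a dict of lists)."""
    keys = list(grid)
    for values in itertools.product(*(grid[key] for key in keys)):
        variant = dict(base_config)
        variant.update(zip(keys, values))
        yield variant


# Example: every rnn_type on the 2019 and 2020 datasets (6 configs).
# for cfg in iter_variants(base, {"rnn_type": ["lstm", "gru", "bi-gru"],
#                                 "dataset_year": ["2019", "2020"]}):
#     assert not check_config(cfg)
```

Each yielded config can be saved to its own JSON file and run with python main.py -c <file>. Whether consecutive runs overwrite each other's output in the 'results' folder is not stated here, so it may be safest to copy the results after each run.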

Owner

Sherzod Hakimov